Video memory management

ABSTRACT

A video memory manager manages and virtualizes memory so that an application or multiple applications can utilize both system memory and local video memory in processing graphics. The video memory manager allocates memory in either the system memory or the local video memory as appropriate. The video memory manager may also manage the system memory accessible to the graphics processing unit via an aperture of the graphics processing unit. The video memory manager may evict memory from the local video memory as appropriate, thereby freeing a portion of local video memory for use by other applications. In this manner, a graphics processing unit and its local video memory may be more readily shared by multiple applications.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority of U.S. Provisional Application Serial No. 60/448,399 entitled “Video Memory Manager Architecture,” filed Feb. 18, 2003.

FIELD OF THE INVENTION

[0002] The invention relates generally to the field of computing, and, more particularly, to a technique for performing video memory management and virtualizing video memory.

BACKGROUND OF THE INVENTION

[0003] The use of graphics in computers has increased dramatically over the years due to the development of graphics based user-friendly application programs and operating systems. To support the computing requirements associated with graphics, computer component manufacturers have developed specialized graphics processing units (GPUs) to offload some of the intense graphics computing demands from the central processing unit (CPU) to these specialized GPUs. Many of these GPUs are implemented on a Peripheral Component Interconnect (PCI) compatible card and include local graphics memory (also referred to herein as video memory) on the card itself. This local video memory enables the GPU to process graphics more quickly.

[0004] Current operating systems typically grant GPU resources (e.g., video memory) on a first-come, first-served basis. If one application has been allocated all of the GPU resources (e.g., the entire local memory of the GPU), then other applications may not be able to run or they may run with errors. As the use of GPUs becomes more prevalent, there is a need for techniques for more fairly allocating GPU resources among applications.

SUMMARY OF THE INVENTION

[0005] A video memory manager manages and virtualizes memory so that an application or multiple applications can utilize both system memory and local video memory for processing graphics with a graphics processing unit. The video memory manager allocates memory in either the system memory or the local video memory as appropriate. The video memory manager may also manage system memory accessible to the graphics processing unit via an aperture of the graphics processing unit. The video memory manager may also evict memory from the local video memory as appropriate, thereby freeing a portion of local video memory for use by other applications. In this manner, a graphics processing unit and its local video memory may be shared by multiple applications.

[0006] The video memory manager may distinguish between various types of graphics data and treat them differently. For example, resources may be distinguished from surfaces. Resources may be stored in a kernel mode of the operating system. Surfaces may be stored in a user mode process space of the operating system. Surfaces may be classified as either static or dynamic, depending on whether the central processing unit has direct access to the surface.

[0007] The video memory manager may use a fencing mechanism, for example, a monotonic counter, to determine information about the status of the graphics processing unit. The graphics processor may increment the counter for each command buffer processed. The video memory manager may determine whether a surface has been used or is about to be used by reading the counter.

[0008] Memory allocation may be divided into big and small memory allocations and treated differently. Big memory allocations may use entire dedicated pages. Small memory allocations may share a single page to conserve memory.

[0009] Other features are described below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] The foregoing summary, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustration, there are shown in the drawings illustrative embodiments of the invention; however, the invention is not limited to the specific embodiments described. In the drawings:

[0011] FIG. 1 is a block diagram of an illustrative computing environment in which aspects of the invention may be implemented;

[0012] FIG. 2 is a block diagram showing more illustrative details of the computing environment of FIG. 1 in which aspects of the invention may be implemented;

[0013] FIG. 3 is a block diagram of a video memory manager in accordance with an embodiment of the invention;

[0014] FIG. 4 is a block diagram of an illustrative addressable entity which may be addressed by a video memory manager in accordance with an embodiment of the invention;

[0015] FIG. 5 is a block diagram of a write request on an addressable entity, showing the resulting mapping modification;

[0016] FIG. 6 is a block diagram of a read request on the modified mapping produced by the write request of FIG. 5;

[0017] FIG. 7 is a block diagram of a random access memory;

[0018] FIG. 8 is a block diagram of an illustrative paging scheme for video memory management in accordance with an embodiment of the invention;

[0019] FIG. 9 is a block diagram of another illustrative address translation mechanism for video memory management, which is adapted for use with the illustrative paging scheme of FIG. 8;

[0020] FIG. 10 is a block diagram of an illustrative segmentation scheme for video memory management in accordance with an embodiment of the invention;

[0021] FIG. 11 is a block diagram showing an illustrative dynamic video memory allocation in accordance with an embodiment of the invention;

[0022] FIG. 12 is a block diagram showing an illustrative state diagram including illustrative states of dynamic video memory allocation in accordance with an embodiment of the invention;

[0023] FIG. 13 is a flow diagram of an illustrative method for video memory management in accordance with an embodiment of the invention;

[0024] FIG. 14 is a flow diagram of another illustrative method for video memory management in accordance with an embodiment of the invention;

[0025] FIG. 15 is a diagram depicting an illustrative usage of a fence in video memory management in accordance with an embodiment of the invention;

[0026] FIG. 16 is a block diagram showing an illustrative static video memory allocation in accordance with an embodiment of the invention;

[0027] FIG. 17 is a block diagram showing an illustrative heap management of physical memory in accordance with an embodiment of the invention; and

[0028] FIG. 18 is a block diagram of an illustrative aperture memory management in accordance with an embodiment of the invention.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0029] Computer System

[0030] FIG. 1 shows an illustrative computing environment 100 in which aspects of the invention may be implemented. Computing environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the illustrative environment 100.

[0031] The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, embedded systems, distributed computing environments that include any of the above systems or devices, and the like.

[0032] The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium. In a distributed computing environment, program modules and other data may be located in both local and remote computer storage media including memory storage devices.

[0033] With reference to FIG. 1, an illustrative system for implementing the invention includes a general purpose computing device in the form of a computer 110. Components of computer 110 may include, but are not limited to, a processing unit 120 (e.g., central processing unit CPU 120), a system memory 130, and a system bus 121 that couples various system components including the system memory 130 to the processing unit 120. The system bus 121 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus (also known as Mezzanine bus).

[0034] Computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 110 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

[0035] The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation, FIG. 1 illustrates operating system 134, application programs 135, other program modules 136, and program data 137. System memory 130 may be separated into kernel memory (which is a memory protected by the operating system 134) and application or process memory (which is a memory used by application programs 135 and is subject to less protection than kernel memory).

[0036] The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 1 illustrates a hard disk drive 141 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 151 that reads from or writes to a removable, nonvolatile magnetic disk 152, and an optical disk drive 155 that reads from or writes to a removable, nonvolatile optical disk 156, such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the illustrative operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 141 is typically connected to the system bus 121 through a non-removable memory interface such as interface 140, and magnetic disk drive 151 and optical disk drive 155 are typically connected to the system bus 121 by a removable memory interface, such as interface 150.

[0037] The drives and their associated computer storage media discussed above and illustrated in FIG. 1 provide storage of computer readable instructions, data structures, program modules and other data for the computer 110. In FIG. 1, for example, hard disk drive 141 is illustrated as storing operating system 144, application programs 145, other program modules 146, and program data 147. Note that these components can either be the same as or different from operating system 134, application programs 135, other program modules 136, and program data 137. Operating system 144, application programs 145, other program modules 146, and program data 147 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 110 through input devices such as a keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 120 through a user input interface 160 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 191 or other type of display device is also connected to the system bus 121 via an interface, such as a video interface 190. In addition to the monitor, computers may also include other peripheral output devices such as speakers 197 and printer 196, which may be connected through an output peripheral interface 195.

[0038] The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local area network (LAN) 171 and a wide area network (WAN) 173, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

[0039] When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 1 illustrates remote application programs 185 as residing on memory device 181. It will be appreciated that the network connections shown are illustrative and other means of establishing a communications link between the computers may be used.

[0040] FIG. 2 shows more details of the illustrative computing environment 100 of FIG. 1. As shown in FIG. 2, video interface 190 includes a graphics processing unit (GPU) 290. GPU 290 typically includes a specialized processor for processing graphics. GPU 290 typically includes a graphics pipeline for high-speed processing of graphics information. Inclusion of GPU 290 in computer 110 may allow offloading of the intense graphics computational demands from CPU 120. As shown, GPU 290 includes video memory 291. Video memory 291 may store graphics data and information useful for generating graphics for display on monitor 191.

[0041] Video interface 190 communicates with other devices in computing environment 100 via Peripheral Component Interconnect (PCI) controller 240 and chipset 250. GPU 290 may include an aperture 292 that functions as a high-speed “window” into system memory 130. That is, aperture 292 of GPU 290 maps to corresponding system memory 130 and allows GPU 290 to view system memory 130 via a virtual memory addressing scheme. This allows GPU 290's view of a memory allocation to appear contiguous, even though the memory allocation may actually be located in discontiguous physical system memory pages.

[0042] Video memory manager 200 may provide address translation for GPU 290, thereby virtualizing memory for GPU 290. Video memory manager 200 may include an address translation mechanism to convert between virtual addresses and physical memory addresses. In this manner, GPU 290 may be more easily shared between multiple applications at the same time. Video memory manager 200 (also referred to herein as VidMm) may reside in a kernel mode component of operating system 134.

[0043] Application program 135 may access various components of computing environment 100 via driver 210. Driver 210 may be implemented as two separate drivers, such as, for example, a user mode driver and a kernel mode driver. The user mode driver (which is typically provided by a GPU supplier) is typically loaded in the private address space of application 135. The user mode driver may request the creation and destruction of memory allocations and generate references to them. However, the user mode driver is typically not involved in the actual management of allocations (e.g., the allocation of actual underlying resources, paging, eviction, and the like). The kernel mode driver (which is typically provided by a GPU supplier) is typically loaded in kernel space of operating system 134. The kernel mode driver may interact with video memory manager 200 in the management of allocations. For example, when video memory manager 200 desires to evict an allocation from video memory to system memory, video memory manager 200 may call the kernel mode driver, which in turn requests GPU 290 to perform some function associated with eviction.

[0044] Such virtualization is made possible because GPU 290 only needs a subset of the allocated memory to be present in local video memory 291 or non-local video aperture 292 at any given time. For example, when drawing a triangle for an application, GPU 290 only uses the texture for that triangle, not the entire set of textures used by the application. Thus, video memory manager 200 may attempt to keep the correct subset of graphics content visible to GPU 290 and move unused graphics content to an alternative medium (e.g., system memory 130).

[0045] Video memory manager 200 may arbitrate the resources among the applications by tracking the allocations made on behalf of every process and balancing resource usage among the processes. Video memory manager 200 may implement the virtualization of memory through the use of a video memory manager 200 created handle. Clients (e.g., application 135) of video memory manager 200 may reference addresses and allocations through the use of the handle. In this manner, a client may not actually know the physical address of the graphics data. Video memory manager 200 may convert a given handle to a GPU visible address.
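
The handle indirection described above can be illustrated with a minimal Python sketch. The class name `HandleTable`, its methods, and the example addresses are hypothetical and are not taken from this disclosure; the sketch only shows how a client may hold an opaque handle while the manager remains free to change the GPU visible address behind it.

```python
import itertools


class HandleTable:
    """Hypothetical handle table: clients hold handles, never addresses."""

    def __init__(self):
        self._next_handle = itertools.count(1)  # source of opaque handle values
        self._gpu_address = {}                  # handle -> current GPU visible address

    def create(self, gpu_address):
        """Register an allocation and return an opaque handle for it."""
        handle = next(self._next_handle)
        self._gpu_address[handle] = gpu_address
        return handle

    def relocate(self, handle, new_gpu_address):
        """Move the allocation (e.g., after an eviction); the handle is unchanged."""
        if handle not in self._gpu_address:
            raise KeyError(f"unknown handle {handle}")
        self._gpu_address[handle] = new_gpu_address

    def resolve(self, handle):
        """Convert a handle to the address currently visible to the GPU."""
        return self._gpu_address[handle]


table = HandleTable()
texture = table.create(0x1000)    # the client keeps only the handle
table.relocate(texture, 0x8000)   # the manager moves the data
```

Because the client never sees an address, the relocation is invisible to it: resolving the same handle afterwards simply yields the new location.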

[0046] FIG. 3 shows more details of an illustrative video memory manager 200. As shown in FIG. 3, video memory manager 200 includes a virtual memory manager 310, a physical memory manager 320, and a non-local aperture manager 330. Virtual memory manager 310 includes an address translation mechanism 305 for virtualizing memory. Video memory manager 200 may also determine information about the state of computing environment 100, indicated by environment information 310, in order to make certain decisions about where to store memory, how to map to memory, and the like. It should be noted that while the term “environment” may suggest that it contains information about the general conditions present at the time some mapping is computed, it is not limited to such information but rather may include any arbitrary type of information. For example, the environment information 310 may include the context of the operating system (e.g., which application is currently executing), and the like.

[0047] Video Memory Manager and Address Translation

[0048] Virtual memory manager 310 includes an address translation mechanism 305 which performs address mapping between a source 305 (e.g., application 135, GPU 290, and the like) that requests data and a data storage device containing the requested data (e.g., video memory 291). The requested data may be stored in video memory 291, in system memory RAM 132, in system memory 132 that is accessible via GPU aperture 292, on hard disk 141, or in other addressable entities.

[0049] Address translation mechanism 305 may perform various address mapping functions between sources and addressable entities (e.g., memory, etc.). FIG. 4 depicts a simple addressable entity 412(1), where each row of the table has a member of A on the left, and a member of M on the right. Thus, in the example of FIG. 4, if f is the function defined by addressable entity 412(1), then f(‘a’)=17, f(‘b’)=6, f(‘c’)=3, and so on.

[0050] With reference to FIGS. 5 and 6, a write operation 502 (“write(‘b’,14)”) on the simple addressable entity 412(1) changes the mapping to 412(1)′, by changing the value “6” to “14” on the line whose set “A” value is ‘b’. If read operation 602 (“read(‘b’)”) is subsequently performed on mapping 412(1)′, this read operation will return the value “14,” since write operation 502 has changed the original mapping 412(1) such that the set A element ‘b’ now maps to the set M element “14”. As noted above, the semantics that allow a read operation following a write operation to return the value that was written are illustrative, but not definitive, of an addressable entity. As discussed below, there are examples of addressable entities whose read and write operations have different semantics.
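
The read and write operations of FIGS. 4 through 6 can be sketched in Python as follows. The class name `AddressableEntity` and its methods are illustrative assumptions; only the values for ‘a’, ‘b’, and ‘c’ come from the example above. The sketch models the simple case in which a read returns the value most recently written, which, as noted, is not required of every addressable entity.

```python
class AddressableEntity:
    """Maps members of an address set A to members of a value set M."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)  # private copy of the initial mapping

    def read(self, address):
        """Return the member of M that the given member of A maps to."""
        return self._mapping[address]

    def write(self, address, value):
        """Change the mapping so that the address maps to the new value."""
        self._mapping[address] = value


entity = AddressableEntity({"a": 17, "b": 6, "c": 3})  # mapping 412(1)
before = entity.read("b")   # value under the original mapping
entity.write("b", 14)       # write operation 502 yields mapping 412(1)'
after = entity.read("b")    # read operation 602 on the modified mapping
```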

[0051] Addressable entities include physical random access memory (e.g., RAM 132, shown in FIG. 1). FIG. 7 shows an example of RAM 132 as an addressable entity. RAM 132, in this example, comprises 2²⁸ bytes, each having a physical address in the range 0 to 2²⁸−1. In this example, the value 17 is stored at address 0, 6 is stored at address 1, 137 is stored at address 2, and so on. Addressable entities also include control registers, CPU registers, and the like.

[0052] Address translation mechanism 305 may be based on paging and segmentation schemes. FIGS. 8-10 depict examples of such schemes. It should be understood that pages and segments are a way of grouping addressable entities into “buckets” so they can be dealt with conveniently in large units.

[0053] FIG. 8 depicts an example of a paging scheme. In FIG. 8, fixed-sized portions of RAM 132 are designated as pages 806(1), 806(2), . . . 806(n). In the example of FIG. 8, each page is four kilobytes (4096 bytes) in length, although paging schemes are not limited to any particular page size (and some paging schemes support pages that have more than one size, e.g., where a page can be either four kilobytes or four megabytes in length). Each page has a base address in RAM 132. The base addresses of pages 806(1), 806(2), and 806(n) are 0x0000, 0x2000, and 0xf000, respectively. (As will be recognized by those of skill in the art, the prefix “0x,” by convention, indicates that a value is in hexadecimal, or base 16.) Within each page, each byte can be described by an offset relative to the page's base address. Thus, within each page the first byte has offset 0, the second byte has offset 1, and so on. Since each page in the example of FIG. 8 is 4096 bytes in length, the last byte of each page has offset 4095 (or 0x0fff).

[0054] Page table 808 is a list of pointers to the various pages 806(1) through 806(n). Each entry in page table 808 may also contain one or more “attributes” as described above, i.e., a marker that indicates whether the page pointed to by the pointer is read/write or read-only, or another marker that indicates whether the page is “present” in RAM 132 or “not present.” (A page might be marked as not present if, say, it had been swapped to disk to make room in RAM 132 for other data.) Each element of page table 808 contains the base address of a page referenced by the page table. Moreover, each element can be identified by an offset into the page table. Thus, the element of page table 808 stored at offset 0 is 0x0000, which is the base address of page 806(1); the element stored at offset 2 is 0x2000, which is the base address of page 806(2); and the element stored at offset 5 is 0xf000, which is the base address of page 806(n). Other offsets into page table 808 point to different pages that are not depicted in FIG. 8. It should be noted that page table 808 is typically stored in RAM 132, as shown by the dashed line encompassing page table 808.

[0055] Address translation mechanism 305 may use page table 808 to convert a virtual address 802 into a physical address. Address translation mechanism 305 may include hardware and software that performs various functions, including the translation of virtual addresses into physical addresses. In the example of FIG. 8, virtual address 802 comprises two parts: a table offset 811 and a page offset 812. Address translation mechanism 305 identifies a particular physical address in RAM 132 based on virtual address 802. In order to identify a physical address, address translation mechanism 305 first reads table offset 811, and uses this value as an index into page table 808. Next, address translation mechanism 305 retrieves whatever address appears in the page table 808 entry defined by table offset 811, and adds page offset 812 to this value. The resulting value is the address of a particular byte in one of the pages 806(1) through 806(n). In the example of FIG. 8, table offset 811 is 0x0002. Thus, address translation mechanism 305 locates the base address stored at offset 2 from the beginning of page table 808. In this case, that base address is 0x2000. Address translation mechanism 305 then adds page offset 812 to the value located in the page table. Page offset 812, in this example, is also 0x0002, so address translation mechanism 305 computes 0x2000+0x0002=0x2002, which is the physical address of the byte in page 806(2) that is indicated by slanted lines.
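
The lookup described above can be expressed as a short Python sketch. The names `PAGE_TABLE` and `translate_paged` are hypothetical; the table holds only the three entries named in the description of FIG. 8, and the virtual address is passed as two separate fields because the text does not specify how table offset 811 and page offset 812 are packed into virtual address 802.

```python
PAGE_SIZE = 4096  # bytes per page in the example of FIG. 8

# Entries of page table 808 named in the text: table offset -> page base address.
PAGE_TABLE = {0: 0x0000, 2: 0x2000, 5: 0xF000}


def translate_paged(table_offset, page_offset):
    """Return the physical address for a (table offset, page offset) pair."""
    if not 0 <= page_offset < PAGE_SIZE:
        raise ValueError("page offset lies outside the page")
    base_address = PAGE_TABLE[table_offset]  # index into the page table
    return base_address + page_offset        # add the offset within the page


physical = translate_paged(0x0002, 0x0002)  # the example of FIG. 8
```

Here `physical` evaluates to 0x2002, matching the worked example. The range check on the page offset is an added safeguard in this sketch and is not a step recited above.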

[0056] Address translation mechanism 305 may also be configured to perform some action based on the attribute(s) contained in the page table. For example, if the access request is to write to a byte of memory, and the page table entry for the page in which that byte is located indicates that the page is read-only, then address translation mechanism 305 may abort the request and/or invoke some type of fault handler. Similarly, if the byte is on a page marked as “not present,” then video memory manager 200 (or the memory manager of the operating system) may take steps to copy the image of the page back into RAM 132 from wherever that image is stored (e.g., disk), and/or may invoke some type of fault handler.
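
A minimal Python sketch of such attribute checks follows. All names (`PageTableEntry`, `ProtectionFault`, `PageNotPresentFault`, `access`, and the `page_in` callback) are assumptions made for illustration; the text does not prescribe how a fault is signalled or how a page image is restored.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised when a write targets a page marked read-only."""


class PageNotPresentFault(Exception):
    """Raised when a page is not present and cannot be brought back."""


@dataclass
class PageTableEntry:
    base_address: int
    writable: bool = True   # read/write versus read-only attribute
    present: bool = True    # present versus not-present attribute


def access(entry, page_offset, is_write, page_in=None):
    """Check the attributes of an entry, then return the physical address."""
    if is_write and not entry.writable:
        raise ProtectionFault("write to a read-only page")
    if not entry.present:
        if page_in is not None:
            page_in(entry)  # hypothetical hook that restores the page image
        if not entry.present:
            raise PageNotPresentFault("page could not be made present")
    return entry.base_address + page_offset


def restore_from_disk(entry):
    """Stand-in for copying the page image back into RAM."""
    entry.present = True


swapped_out = PageTableEntry(base_address=0x2000, present=False)
address = access(swapped_out, 0x0002, is_write=False, page_in=restore_from_disk)
```

In this sketch the read-only check aborts the request with an exception, while the not-present case first gives a caller-supplied routine the chance to bring the page back before the access proceeds.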

[0057] FIG. 9 shows another illustrative address translation mechanism 305. Address translation mechanism 305 implements a function that: (A) locates page table 808; (B) finds the offset in the page table indicated by table offset 811; (C) retrieves the physical address stored at the offset found in (B); (D) adds page offset 812 to the physical address retrieved in (C); and (E) produces the result computed in (D). The function implemented by address translation mechanism 305 may also take certain action (e.g., aborting access requests, generating faults or exceptions, swapping pages into memory) depending on the state of the attributes, as described above.

[0058] FIG. 10 depicts an illustrative segmentation scheme. In this example, sections of RAM 132 called “segments” are delimited. FIG. 10 shows four illustrative segments, 1006(0), 1006(1), 1006(2), and 1006(3). Each segment has a base address and a length. Segments may have different lengths. Segment table 1008 lists the base addresses and lengths of segments 1006(0) through 1006(3). Thus, segment 1006(0) begins at base address 0x0000 and has length 4096, segment 1006(1) begins at base address 0x4000 and has length 1024, and so on. Segment table 1008 is typically stored in RAM 132, as indicated by the dashed lines. Segment table 1008 may also list, for each segment, information such as read-only/read-write, present/not-present, etc., as described above.

[0059] Address translation mechanism 305 converts a virtual address 1002 into a physical address using segment table 1008. Virtual address 1002 comprises a segment number 1011 and a segment offset 1012. Thus, in the example of FIG. 10, address translation mechanism 305 uses segment number 1011 as an offset into segment table 1008. In this example, segment number 1011 is “1”, so address translation mechanism 305 looks at offset 1 into segment table 1008, and locates the address 0x4000. Address translation mechanism 305 then adds segment offset 1012 (in this case 0x0000) to this address to create a physical address. Thus, 0x4000+0x0000=0x4000. Thus, address translation mechanism 305 identifies the byte in segment 1006(1) indicated by slanted lines.
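
The segment lookup can be sketched in Python as follows. The names `SEGMENT_TABLE` and `translate_segmented` are hypothetical, and only the two segments whose base address and length are stated above are listed. Checking the segment offset against the segment length is an assumption of this sketch; the text records the lengths in segment table 1008 but does not recite a bounds check.

```python
# Entries of segment table 1008 stated in the text: (base address, length).
SEGMENT_TABLE = [
    (0x0000, 4096),  # segment 1006(0)
    (0x4000, 1024),  # segment 1006(1)
]


def translate_segmented(segment_number, segment_offset):
    """Return the physical address for a (segment number, offset) pair."""
    base_address, length = SEGMENT_TABLE[segment_number]
    if not 0 <= segment_offset < length:  # assumed bounds check
        raise ValueError("segment offset lies outside the segment")
    return base_address + segment_offset


physical = translate_segmented(1, 0x0000)  # the example of FIG. 10
```

Unlike the paging sketch, each entry carries its own length, so two segments can cover regions of different sizes.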

[0060] Moreover, the virtual address may include a field or bits to indicate which storage medium contains the physical memory. For example, a first field (of two bits) may have the value of one if the physical memory is in video memory 291, may have the value of two if the physical memory is in system memory 130 and not visible through non-local aperture 292, and may have the value of three if the physical memory is in system memory 130 visible through non-local aperture 292.
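
One possible encoding of such a field is sketched below in Python. The field values one, two, and three come from the text; the 32-bit address width, the placement of the field in the two most significant bits, and the names `Medium`, `encode`, and `decode` are assumptions made only for illustration.

```python
from enum import IntEnum


class Medium(IntEnum):
    VIDEO_MEMORY = 1     # local video memory 291
    SYSTEM_MEMORY = 2    # system memory 130, not visible through aperture 292
    APERTURE_MEMORY = 3  # system memory 130 visible through aperture 292


ADDRESS_BITS = 32                    # assumed width of a virtual address
FIELD_SHIFT = ADDRESS_BITS - 2       # assumed: field occupies the top two bits
OFFSET_MASK = (1 << FIELD_SHIFT) - 1


def encode(medium, offset):
    """Pack a storage medium and an offset into one virtual address."""
    if offset & ~OFFSET_MASK:
        raise ValueError("offset does not fit below the medium field")
    return (int(medium) << FIELD_SHIFT) | offset


def decode(virtual_address):
    """Split a virtual address into its storage medium and offset."""
    return Medium(virtual_address >> FIELD_SHIFT), virtual_address & OFFSET_MASK


virtual_address = encode(Medium.APERTURE_MEMORY, 0x2002)
medium, offset = decode(virtual_address)
```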

[0061] As seen, address translation allows video data to be stored in various data storage devices and allows virtualization of video memory (e.g., video memory 291, non-local aperture 292, system memory 130). Video memory manager 200 may also perform memory management (including memory allocation/deallocation) to support the virtualization of memory. A video memory allocation is a collection of bits that holds some content for a surface. Before discussing memory management in detail, we describe various types of graphics data and resources for processing graphics.

[0062] Surfaces

[0063] A surface represents a logical collection of bits allocated on behalf of an application. The content of a surface (i.e., the logical collection of bits) is typically under the control of the application. A surface may be constructed out of one or more video memory allocations. These video memory allocations may or may not be directly visible to the application even though the application can ultimately control the content. An example of a surface having more than one video memory allocation is a palettized texture on hardware that doesn't support such a type of texture. The driver could use one video memory allocation to hold the content of the texture in palettized mode, and use a second video memory allocation to hold the content of the texture in expanded mode. Surfaces may be dynamic or static; the difference is how the application accesses the content of that surface.

[0064] A static surface is a surface for which the application doesn't have direct CPU access to the bits of the surface, even though it can control the content indirectly. An application may understand the logical format of the surface and control the content, for example, through a GPU operation. ‘Static’ means that the content of the surface should only change if the surface is the target of a GPU 290 operation. Static surfaces may be used to allocate textures, vertex buffers, render targets, z-buffers, and the like. A static surface may include multiple static video memory allocations, described in more detail below.

[0065] Dynamic surfaces are similar to static surfaces, except that an application can request to have direct CPU access to the bits of the surface. Dynamic surfaces allow the application to access the content of the surface through GPU operation and through direct CPU access. A dynamic surface includes at least one dynamic video memory allocation and can include static video memory allocations, described in more detail below.

[0066] Resources

[0067] A resource is a memory allocation (e.g., video memory) that driver 210 may use to support one or more applications but for which no application controls or should be allowed to control the content directly. For example, when an application uses a vertex shader, the driver compiles the shader into a GPU specific binary that is executed by the GPU. While the application controls the content of that buffer indirectly by specifying the vertex shader to use, the application doesn't control the exact binary that gets produced. For security reasons, the content of those allocations is not typically made directly available to the application. A resource typically includes a single physical video memory allocation. Resources include application resources and driver resources.

[0068] An application resource is a resource used by driver 210 to support a particular application but which can't be directly accessed by the application. If the resource fails, the application doesn't work properly, but other applications continue to work properly. Examples include a pixel shader binary compiled for a particular application's pixel shader code, an application GPU page table, and the like.

[0069] Driver resources are resources that driver 210 uses to allow the operation of all applications. The difference is that driver resources aren't bound to a particular application. A driver resource may be, for example, the primary swap chain for the desktop.

[0070] Video Memory Allocation

[0071] As stated above, a video memory allocation is a collection of bits that holds some content for a surface. A static video memory allocation is a video memory allocation that, in general, is not directly accessed by CPU 120. A dynamic video memory allocation is a video memory allocation that may be directly accessed by CPU 120. A dynamic surface, therefore, includes at least one dynamic allocation while a static surface does not include a dynamic allocation.

[0072] A physical video memory allocation is an allocated range in a particular physical video memory segment of video memory 291.

[0073] A non-local aperture allocation is an allocated range in the physical space controlled by non-local aperture 292. It should be understood that this type of allocation can't by itself hold any graphics data. It is only a physical space allocation, and that physical space in non-local aperture 292 is redirected to system memory 130 (e.g., to the pages holding the video memory allocation data).

[0074] Video Memory Manager

[0075] Video memory manager 200 performs various functions during memory management, such as, for example, allocation and deallocation of physical memory, allocation and deallocation of virtual memory, protection of memory, eviction of data from one data storage device to another, and the like. Video memory manager 200 may use one or a combination of a virtual memory manager 310, a physical memory manager 320, and a non-local aperture manager 330 to perform various functions related to memory management. While video memory manager 200 is shown as having three memory managers, video memory manager 200 may include any number of memory managers and the functionality may be apportioned between the various memory managers in any convenient fashion.

[0076] Physical Memory Manager

[0077] Physical memory manager 320 manages physical video memory 291 and a portion of physical system memory 130. Physical memory manager 320 attempts to find an appropriate free range of contiguous physical video memory 291 when a video memory allocation is requested. When physical video memory 291 is full, physical memory manager 320 (in conjunction with virtual memory manager 310) may evict data to system memory 130. Physical memory manager 320 may also determine which allocation to evict when physical video memory 291 is full. The address space of the physical video memory 291 can be divided into one or more segments and each segment may be managed separately as a linear heap, as pages, and the like. Driver 210 may decide how each segment should be managed.

[0078] The physical address space of GPU 290 may be divided into multiple segments (referred to herein as physical video memory segments) that form the pool of available local video memory 291. Each physical video memory allocation is allocated from one of those segments. Segmenting the physical address space of GPU 290 allows different portions of video memory 291 to be treated differently. For example, only a subset of the address space might be visible through the aperture. Similarly, certain types of surfaces might only be allocated from certain segments, and not others.

[0079] In heap management mode, physical memory manager 320 may create a heap the size of the segment and satisfy requests for memory allocations by allocating a linear contiguous range in that heap. Physical memory manager 320 may maintain for each segment a list of surfaces and a list of processes having memory committed in the heap, as shown in FIG. 17. The list of allocations may be maintained in a least recently used (LRU) order. Each time driver 210 notifies physical memory manager 320 of the usage of an allocation, physical memory manager 320 puts that allocation at the end of the list for the segment in which it is allocated. Similarly, each time a surface is allocated in the segment, the process it is associated with is updated with information about how much memory it has committed in that segment. These two pieces of information may be used to implement an eviction policy.

[0080] When the segment is full and something needs to be allocated, physical memory manager 320 may choose candidates for eviction as follows. First, check if some surfaces haven't been used for a long time and move those surfaces to the eviction list (and add their memory to the free list). Second, try allocating memory again. If successful, determine which allocations in the eviction list get to be reused, evict those allocations to system memory 130, and return a new physical address to the caller. Third, trim all processes to the maximum working set. For each process, move all the least recently used allocations to the eviction list until that process's total committed memory is below the maximum working set. Fourth, try allocating memory again. Fifth, trim all processes to the minimum working set. For each process, move all the least recently used allocations to the eviction list until that process's total committed memory is below the minimum working set. Sixth, try allocating memory again. Seventh, scan the list of allocations for that process in LRU order; if a block fits, use it. Eighth, try allocating memory again. Ninth, if the surface shouldn't be aggressively committed, return an error to the caller. Tenth, mark all allocations already committed for that process in the heap for eviction. Eleventh, try allocating memory again. Twelfth, mark all allocations in the heap for every process (from the surface allocator) as ready for eviction. Thirteenth, try memory allocation again.
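The LRU bookkeeping and the working-set trimming steps described above might be sketched as follows. This is an illustrative simplification, not the described implementation: the fixed-size array, the names segment_lru, seg_touch, and seg_trim, and the integer process identifiers are all assumptions, and the sketch trims until the committed size is no greater than the supplied limit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ALLOCS 16

typedef struct {
    int    owner;      /* hypothetical process identifier */
    size_t size;       /* bytes committed in the segment */
    int    evictable;  /* 1 once moved to the eviction list */
} alloc_rec;

typedef struct {
    alloc_rec lru[MAX_ALLOCS];  /* index 0 is the least recently used entry */
    size_t    count;
} segment_lru;

/* A new allocation enters at the most recently used end of the list. */
static void seg_add(segment_lru *s, int owner, size_t size)
{
    assert(s->count < MAX_ALLOCS);
    s->lru[s->count++] = (alloc_rec){ owner, size, 0 };
}

/* The driver reported a use of entry i: move it to the end of the list. */
static void seg_touch(segment_lru *s, size_t i)
{
    assert(i < s->count);
    alloc_rec used = s->lru[i];
    for (size_t j = i; j + 1 < s->count; j++)
        s->lru[j] = s->lru[j + 1];
    used.evictable = 0;  /* a referenced surface leaves the eviction list */
    s->lru[s->count - 1] = used;
}

/* Bytes a process still has committed (not yet marked for eviction). */
static size_t committed(const segment_lru *s, int owner)
{
    size_t total = 0;
    for (size_t i = 0; i < s->count; i++)
        if (s->lru[i].owner == owner && !s->lru[i].evictable)
            total += s->lru[i].size;
    return total;
}

/* Mark the owner's least recently used allocations for eviction until its
 * committed size is within the limit. Returns the number of bytes reclaimed. */
static size_t seg_trim(segment_lru *s, int owner, size_t limit)
{
    size_t reclaimed = 0;
    size_t total = committed(s, owner);
    for (size_t i = 0; i < s->count && total > limit; i++) {
        alloc_rec *a = &s->lru[i];
        if (a->owner != owner || a->evictable)
            continue;
        a->evictable = 1;
        total -= a->size;
        reclaimed += a->size;
    }
    return reclaimed;
}
```

Under these assumptions, the third and fifth steps correspond to calling seg_trim for each process with the maximum and then the minimum working set as the limit, retrying the allocation after each pass.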

[0081] When marking surfaces for eviction, physical memory manager 320 doesn't have to actually evict the surface at that moment; it can just reclaim the physical memory range (and remember the range in the eviction list). When memory is actually allocated for the new allocation, physical memory manager 320 may check the list to see which surfaces are currently located in that range. Then, physical memory manager 320 may evict those surfaces from video memory 291 and truly reclaim memory. Surfaces not actually evicted may remain on the eviction list until the next eviction or until driver 210 references a surface, in which case it may be removed from the eviction list and put back at the end of the allocated list. An illustrative API for use by driver 210 to allocate physical memory for an application or a driver resource is given by:

NTSTATUS VidMmAllocateContiguous(
    IN PVOID HwDeviceExtension,
    IN VIDMM_SEGMENT Segment,
    IN SIZE_T Size,
    IN ULONG Alignment,
    OUT PPHYSICAL_ADDRESS PhysAddress);

[0082] A surface allocator for dynamic and static surfaces may use a slightly different API to allocate physical memory, as shown below.

NTSTATUS VidMmiAllocateContiguous(
    IN PVOID HwDeviceExtension,
    IN VIDMM_SEGMENT Segment,
    IN HANDLE hAlloc,
    IN BOOLEAN Aggressive);

[0083] Illustrative APIs to free the memory are given by:

NTSTATUS VidMmFreeContiguous(
    IN PVOID HwDeviceExtension,
    IN PPHYSICAL_ADDRESS PhysAddress);

NTSTATUS VidMmiFreeContiguous(
    IN PVOID HwDeviceExtension,
    IN HANDLE hAlloc);

[0084] Non-local Aperture Manager

[0085] Non-local aperture manager 330 manages non-local aperture 292. Non-local aperture manager 330 doesn't actually “allocate” any memory; rather, it allocates a range in aperture 292 itself. Aperture 292 is really an address space, so what non-local aperture manager 330 allocates is address space to be redirected (mapped) to actual physical memory in system memory 130. Non-local aperture manager 330 may manage the space inside the aperture on a page basis. Once a range is allocated, non-local aperture manager 330 can lock a system memory surface into place and map it through the non-local aperture 292. Non-local aperture manager 330 may call a driver responsible for aperture 292 to do the mapping on its behalf. FIG. 18 depicts illustrative management of non-local aperture 292. An illustrative API is given below.

NTSTATUS VidMmNonLocalMap(
    IN PVOID HwDeviceExtension,
    IN PVOID pvLin,
    OUT PPHYSICAL_ADDRESS PhysAddr);

NTSTATUS VidMmNonLocalUnMap(
    IN PVOID HwDeviceExtension,
    IN PHYSICAL_ADDRESS PhysAddr);
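Page-basis management of the aperture address space might be sketched as a simple first-fit search over a per-page occupancy map. The 64-page aperture size and the names aperture_map, aperture_reserve, and aperture_release are assumptions for illustration; locking the system pages and programming the aperture hardware are outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define APERTURE_PAGES 64u   /* assumed aperture size, in pages */

/* One byte per aperture page: 1 if the page of address space is reserved. */
typedef struct {
    unsigned char used[APERTURE_PAGES];
} aperture_map;

static void aperture_init(aperture_map *m)
{
    memset(m->used, 0, sizeof m->used);
}

/* First-fit reservation of npages contiguous pages of aperture address space.
 * Returns the index of the first page, or -1 if no range is large enough. */
static long aperture_reserve(aperture_map *m, size_t npages)
{
    size_t run = 0;

    assert(npages > 0);
    for (size_t i = 0; i < APERTURE_PAGES; i++) {
        run = m->used[i] ? 0 : run + 1;
        if (run == npages) {
            size_t first = i + 1 - npages;
            memset(&m->used[first], 1, npages);
            return (long)first;
        }
    }
    return -1;
}

/* Return a previously reserved range so it can be reused for other mappings. */
static void aperture_release(aperture_map *m, size_t first, size_t npages)
{
    assert(first + npages <= APERTURE_PAGES);
    memset(&m->used[first], 0, npages);
}
```

Note that only address space is handed out here; no memory is allocated, which mirrors the distinction drawn above.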

[0086] Virtual Memory Manager

[0087] Virtual memory manager 310 may perform dynamic and static video memory allocations. Virtual memory manager 310, in effect, creates a hierarchy of data storage for graphics data. Thus, as described above, a video memory allocation may not be resident in physical video memory 291. Instead, the bits of a video memory allocation might be in physical video memory 291, in physical system memory 130 (and may be visible or not visible through aperture 292), or even on hard disk 141 accessible via the page file system of operating system 134.

[0088] Resource Allocation and Management

[0089] As described above, resources are important to the proper rendering of graphics. As such, video memory manager 200 (in conjunction with virtual memory manager 310) may attempt to protect some memory (e.g., memory associated with a resource) from being corrupted by other applications. Some processors allow physical memory to be accessed directly, so an application program 135 (also referred to herein as a process) could execute an instruction to access a given physical address regardless of whether that address had been assigned to the process's address space.

[0090] Video memory manager 200 may protect a video memory allocation by implementing a process specific handle for each process, by allowing direct CPU access only to video memory allocations owned by a specified process, and the like, as described in more detail below.

[0091] Video memory manager 200 may also protect a video memory allocation in system memory 130 by storing the video memory allocation in kernel memory while other (typically less critical) video memory allocations may be stored in the private process address space of an application 135. Kernel memory is the area of memory used by operating system 134 and provides protection against access by processes. That is, when allocating memory for a resource, video memory manager 200 (e.g., via virtual memory manager 310 and physical memory manager 320) may allocate memory in the kernel memory portion of system memory 130 if there is not appropriate space in video memory 291. Also, video memory manager 200 may store the actual mappings from handles or virtual addresses to actual physical addresses in kernel memory to protect the mappings from being accessed by other applications, etc. Further, video memory manager 200 (e.g., via virtual memory manager 310 and physical memory manager 320) may evict resource video memory allocations to the kernel memory portion of system memory 130 and adjust the virtual memory mappings accordingly.

[0092] Alternatively, video memory manager 200 may not evict any resources, but maintain all resources in video memory 291. This type of allocation may be offered to driver 210 by means of directly allocating physical video memory 291 that is not evictable. In such a case, drivers should keep the number of such allocations small, otherwise physical video memory 291 may get filled with unevictable allocations.

[0093] When visible through the non-local aperture 292, the video memory allocation may be locked in system memory (e.g., using the MmProbeAndLockPages( ) mechanism) and mapped through non-local aperture 292. In this state, the bits of the video memory allocation still reside in the page file system but should remain present in physical system memory 130 because of the locking operation. To map the video memory allocation through the non-local aperture 292, a range is allocated in the aperture 292 itself, referred to herein as a non-local aperture allocation.

[0094] Application Access to Graphics Data

[0095] When application 135 sends a rendering command to driver 210 that references an allocation, driver 210 informs video memory manager 200 about the reference so that video memory manager 200 can load the surface in some physical memory accessible to GPU 290. If the surface is currently in system memory 130, video memory manager 200 may look at flags of the surface and allocate the proper GPU resource (e.g., some address range of non-local aperture 292 or some address range of local video memory 291). If the surface was allocated in video memory 291, then the video memory manager 200 allocates memory from the physical memory manager 320. If the surface was allocated in non-local aperture 292, then the video memory manager 200 sends the virtual address of the allocation's system memory buffer to the non-local aperture manager 330, which may lock the memory and map the memory through non-local aperture 292.

[0096] Static Video Memory Allocation and Management

[0097] FIG. 18 illustrates forming a static video memory allocation. When stored in system memory 130, a static video memory allocation may reside in the private address space of the associated application. Allowing the application to directly access the bits of the static video memory allocation is typically acceptable because the application can directly control the content anyway and so any graphics data corruption should only affect that application and should not hang GPU 290.

[0098] In theory, video memory manager 200 could allocate a static video memory allocation in system memory 130 only when the allocation is evicted to system memory 130 and could free the corresponding portion of system memory 130 when the allocation resides in local video memory 291. A disadvantage with this approach is that the virtual address space of the application is also used by the application itself for regular memory allocation. Thus, there is no guarantee that video memory manager 200 could allocate space in the private address space of the application for the static video memory allocation upon an eviction from video memory 291. Therefore, video memory manager 200 may keep the system memory 130 portion of the static video memory allocation in place, so that space is reserved for an eviction from physical video memory 291.

[0099] When video memory 291 is full, video memory manager 200 may evict a static allocation to make room for a new allocation. In such a case, video memory manager 200 brings the content of video memory 291 back to system memory 130. If the surface hasn't been modified since it was cached from system memory 130, then the content of video memory 291 may be discarded. If the content was modified, then non-local aperture manager 330 may map the system memory allocation through non-local aperture 292 and request driver 210 to transfer the content of video memory 291 to that buffer. Once the transfer is completed, the surface is unmapped from non-local aperture 292.

[0100] If the surface is currently mapped through non-local aperture 292, the eviction is relatively easy. As explained before, an allocation visible through the non-local aperture 292 has its virtual address referencing the pages in system memory 130. The pointer remains the same whether or not non-local aperture 292 is redirecting GPU 290 to the same physical pages. Because of this, removing the redirection range in non-local aperture 292 has no effect on the application accessing the surface through the CPU page table. Thus, to evict the surface from non-local aperture 292, video memory manager 200 reclaims the previously reserved range in aperture 292 that was being redirected to that allocation and unlocks the pages in system memory 130 so the operating system memory manager can page them out to hard disk 141. That is, video memory manager 200 may unmap unused allocations from non-local aperture 292. The ranges of non-local aperture 292 that were unmapped can then be reclaimed by video memory manager 200 (and subsequently reused for other allocations to be accessed by GPU 290).

[0101] Evicting from physical video memory 291 is more complex than evicting from non-local aperture 292. When the eviction occurs while the surface is in video memory 291, video memory manager 200 allocates pages in system memory 130 for the allocation, copies the content of the surface from video memory 291 to these allocated pages, and remaps the user mode virtual address to reference the newly allocated pages. This entire process should work even while the application may be accessing the virtual address that needs to be copied and remapped. This may be handled by the memory manager of the operating system through the API MmRotatePhysicalView( ). This API allows rotation of the virtual address from a physical video memory location to a system memory location as an atomic operation as seen by the application.

[0102] Static allocations may be allocated from a heap that is created in each process the first time a static allocation is requested. The heap may be managed like a regular heap and the surfaces allocated as regular system memory. The linear address from the heap allocation may be associated with that allocation for its lifetime. Allocating a static buffer may include allocating the surface in the process video heap. Since there is no content for the surface at creation time, there is no need to actually allocate any video memory 291 or system memory 130 viewable through non-local aperture 292 at that time.

[0103] A memory heap is a range of virtual space, in the process private virtual memory space, for allocation of virtual memory. Typically, each video memory allocation gets a small portion of the heap. The heap may grow over time and can actually include multiple ranges of virtual space if the original range can't be grown. A heap may be used to reduce fragmentation of the address space of the application. The heap may be allocated as a rotatable virtual address range. In a rotatable range, video memory manager 200 can specify, for each page of the heap, whether to refer to a location in the frame buffer or to be backed by a page of system memory 130.

[0104] Dynamic Video Memory Allocation and Management

[0105] Dynamic video memory allocations use a medium to hold the bits of the allocation and a virtual address referring to those bits. Virtual memory manager 310 may use either physical video memory 291 or system memory 130 to hold the bits of a dynamic video memory allocation. While in physical video memory 291, the dynamic video memory allocation is associated with a physical video memory allocation (from physical memory manager 320). In this state, the video memory allocation is directly visible to GPU 290 and can be used for rendering operations.

[0106] When the bits of the allocation are evicted from video memory 291, or mapped through the non-local aperture 292, video memory manager 200 allocates a portion of system memory 130 to store those bits. The system memory could potentially be allocated from either the kernel memory or the process space of the application. Since kernel memory is a limited resource that is shared among all applications, video memory manager 200 allocates from the process space of the application. Because system memory is allocated from the process space of the application, an application can access the bits of that allocation directly without going through the locking mechanism. Because the application controls the content of those allocations anyway, this isn't a security violation. This may result in unknown data being present in those allocations (which may result in a rendering artifact), but it typically won't affect other applications or hang GPU 290.

[0107] When the bits of an allocation reside in system memory 130, they can't be directly accessed by GPU 290 unless the physical system pages forming the buffer of system memory are made visible through non-local aperture 292. In that state, the dynamic video memory allocation will be associated with a range of non-local aperture address space allocated by the non-local aperture manager 330. The non-local aperture hardware of GPU 290 redirects that address space to the appropriate physical pages in system memory 130.

[0108] In theory, the virtual address referring to the bits of the allocation is used only when the application accesses those bits or when the surface is in system memory 130 (to hold the content of the allocation). Thus, when the surface is currently cached in video memory 291 and the surface isn't being accessed by the application, the virtual address isn't needed. However, not having a virtual address associated with the allocation at all times may cause a problem when video memory manager 200 transitions the allocation from one state to another, because it might not be able to allocate that virtual address if the application process space doesn't contain a range large enough for the allocation. In that case, it is possible that a surface couldn't be evicted from video memory 291 because there is not enough free space for it in system memory 130.

[0109] For this reason, video memory manager 200 may associate a virtual address with a dynamic video memory allocation when it's first allocated and store the virtual address as long as the allocation exists. This way, video memory manager 200 has the virtual address when changing the state of the allocation. Similarly, that virtual address is typically committed up front rather than waiting until eviction time.

[0110] FIG. 11 illustrates forming a dynamic video memory allocation. For a dynamic video memory allocation, a locking mechanism is available to applications to allow them to directly access the virtual address allocated inside their process address space. The virtual address can reference actual physical video memory 291 (visible through the PCI frame buffer aperture) or the physical system pages of system memory 130.

[0111] The application 135 may call the Lock( ) function to obtain the virtual address. When the application is done with the access, it may call the Unlock( ) function to allow GPU operations on the allocation to resume. The application may call the Lock( ) function before accessing the content of the allocation to ensure that the driver had a chance to flush all graphics operations for that allocation. Graphics operations (or DMA) referencing the allocation should not be sent to GPU 290 while the surface is being accessed by the application.

[0112] The application 135 typically cannot determine the actual physical location of the allocation when it's accessing it through the virtual address. Furthermore, the actual physical location can be modified while the allocation is being accessed. Video memory manager 200 could decide, for example, to evict the surface being accessed out of video memory 291 to make room for another allocation. This eviction process is transparent to application 135 and doesn't result in any loss of content.

[0113] The granularity of the virtual address may define the granularity of the allocation to protect each process's video memory from one another. Similarly, because of the way virtual memory works, the lower “n” bits of a virtual address are really the offset within the physical page where the bits are being held. Thus those “n” lower bits of the virtual address are the same as the “n” lower bits of the physical address, which means that once a surface has been allocated at a specified offset within a page it remains at that relative offset within the new medium even if remapped to a new location. For example, evicting an allocation out of video memory 291 while being accessed by application 135 implies having a virtual address in system memory 130 that has the same lower “n” bits as the current location in physical video memory 291. The same is true when bringing the surface back to video memory 291. Therefore, video memory manager 200 may find a location in video memory 291 that has the same lower “n” bits as the virtual memory for that allocation.
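The constraint on the lower “n” bits can be illustrated with a short sketch. The value n = 12 (a 4 KB page) and the function names are assumptions chosen for the example; the description does not fix a page size at this point.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12u                               /* assumed n: 4 KB pages */
#define PAGE_MASK ((UINT64_C(1) << PAGE_BITS) - 1)

/* True if the virtual and physical addresses share the same lower n bits. */
static int same_page_offset(uint64_t va, uint64_t pa)
{
    return ((va ^ pa) & PAGE_MASK) == 0;
}

/* Lowest physical address at or above range_start whose lower n bits match
 * those of va, i.e. a candidate location when bringing a surface back. */
static uint64_t first_compatible(uint64_t range_start, uint64_t va)
{
    uint64_t want = va & PAGE_MASK;
    uint64_t cand = (range_start & ~PAGE_MASK) | want;

    if (cand < range_start)
        cand += PAGE_MASK + 1;   /* same offset, next page */
    assert(same_page_offset(va, cand));
    return cand;
}
```

For instance, under this assumption a surface whose virtual address ends in 0x234 can only be placed at physical addresses that also end in 0x234.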

[0114] One mechanism to allocate the virtual address associated with a dynamic video memory allocation may be a memory manager of operating system 134 (also referred to herein as Mm) that supports a rotatable virtual address descriptor (VAD). When the content of the allocation isn't present in physical video memory 291, the VAD may be rotated to regular pageable system memory 130. When the allocation is brought into physical video memory 291, the VAD is MEM_RESET so that Mm can reuse the physical pages that were used without transferring the content to the page file on disk. At the first lock operation, the VAD is rotated to the physical memory location where the surface resides in physical video memory 291. The VAD isn't rotated back on an unlock; instead, the VAD referencing the physical video memory location is stored until the allocation is either moved in video memory 291, freed, or evicted to system memory 130.

[0115] Using this mechanism, video memory manager 200 can control the virtual address space of the application on the natural page size of computing environment 100 (e.g., 64 KB), which means that allocations are rounded up to the next multiple of the page size. To reduce the impact of this expansion, video memory manager 200 may distinguish between big allocations and small allocations. Video memory manager 200 may align a big allocation to the natural page size and may pack small allocations inside of chunks to conserve space. The chunks may be managed by the video memory manager 200 similar to regular dynamic video memory allocations. When video memory manager 200 changes the state of one surface within the chunk, it may change the state of all the sub-allocations. A virtual memory chunk is a range of virtual space in the process private virtual memory space. It is similar to a process video memory heap except that it typically holds only a few surfaces. The surfaces in the virtual memory chunk may be moved in and out of local video memory 291 by video memory manager 200.
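The rounding of big allocations and the packing of small allocations into a chunk might look like the following sketch. The single-page chunk size, the bump-style packing, and the names round_up_to_page, chunk, and chunk_pack are illustrative assumptions; the description does not specify how sub-allocations are laid out inside a chunk.

```c
#include <assert.h>
#include <stddef.h>

#define NATURAL_PAGE ((size_t)0x10000)   /* 64 KB, per the example above */

/* Big allocations are rounded up to a multiple of the natural page size. */
static size_t round_up_to_page(size_t size)
{
    return (size + NATURAL_PAGE - 1) / NATURAL_PAGE * NATURAL_PAGE;
}

/* Hypothetical chunk: one natural page shared by several small surfaces. */
typedef struct {
    size_t used;   /* bytes handed out so far */
} chunk;

/* Pack a small allocation into the chunk at the next aligned offset.
 * Returns the offset within the chunk, or -1 if it does not fit. */
static long chunk_pack(chunk *c, size_t size, size_t align)
{
    size_t off;

    assert(align > 0);
    off = (c->used + align - 1) / align * align;
    if (off > NATURAL_PAGE || size > NATURAL_PAGE - off)
        return -1;
    c->used = off + size;
    return (long)off;
}
```

Because every surface packed into a chunk shares the chunk's virtual range, a state change for one of them applies to all of them, as noted above.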

[0116] FIG. 12 illustrates a state diagram showing illustrative states of a dynamic video memory allocation. In the initial state (state zero), the dynamic video memory allocation is allocated but doesn't have content yet. Thus, the content of the allocation is unknown. If application 135 uses the allocation as the source of a GPU operation, the result of the rendering will be unknown. To get content into the allocation, application 135 can use GPU 290 to render into it (which brings the allocation to state one) or application 135 can lock the surface and manually put content into the allocation using the CPU (which brings the allocation to state six).

[0117] In state one, the bits of the allocation reside in physical video memory 291. In this case, the dynamic video memory allocation is associated with a physical video memory allocation from the physical memory manager 320. In state one, there doesn't need to be a virtual address referring to physical video memory 291, as the allocation doesn't need to be accessible. Physical memory could be allocated from a segment that is visible or not visible to CPU 120. From state one, the allocation can be locked by application 135 for direct CPU access or evicted out of video memory 291 to make room for another allocation. In state one, the rotatable VAD for the allocation could be either referring to system memory, if the allocation hasn't been locked yet at its current location, or rotated to the physical video memory location otherwise.

[0118] In state two, the bits of the allocation reside in physical video memory 291 and the rotatable VAD is currently rotated to the physical video memory location where the allocation resides. Thus, the bits of the allocation are allocated from a segment that is visible to CPU 120. If the surface was originally allocated from a segment not visible to CPU 120 (e.g., in state one), the allocation may be moved to a segment that is visible before the allocation reaches state two. While in state two, application 135 typically does not send rendering commands to GPU 290 referring to the allocation; application 135 first relinquishes its hold on the virtual address associated with the surface. While in state two, the surface can still get evicted to system memory 130. In this case, the VAD is rotated to system memory 130 and the memory manager of the operating system may ensure that this process appears atomic to application 135 (i.e., the application's access to the virtual address will continue normally during the transfer and no content is lost).

[0119] In state three, the bits of the allocation reside in regular pageable system memory 130. Thus, the dynamic video memory allocation is no longer associated with a physical video memory allocation. In state three, the rotatable VAD may be rotated back to system memory 130. Even though a virtual address to the bits of the allocation is accessible by application 135, application 135 should not try to access those bits while in state three since the runtime may not synchronize accesses with a GPU rendering command. If application 135 requests a rendering command referencing the allocation, the allocation is brought back to state one or four before GPU 290 can access the allocation.

[0120] In state four, video memory manager 200 has decided to make the allocation visible through non-local aperture 292. In state four, the bits of the allocation still reside in system memory 130 (e.g., VAD rotated to system memory 130). However, the pages are locked in place and cannot be paged to hard disk 141. The GPU can access the allocation through the non-local aperture range directly. Similar to state three, an application should not use the virtual address referencing the allocation directly, as this virtual address isn't guaranteed to contain data in the format the application expects or even be valid (e.g., the allocation could transition into another state).

[0121] In state five, the allocation is visible through non-local aperture 292; however, the application may directly access the virtual address referring to the physical system pages of the allocation. While in state five, video memory manager 200 keeps the virtual address valid and may refuse any GPU rendering command referencing the allocation until application 135 relinquishes its hold on the allocation. In this case, evicting the surface out of non-local aperture 292 doesn't have any consequences on application 135 because the virtual address remains the same except that the non-local aperture 292 no longer redirects a range to physical system pages referred to by the virtual address.

[0122] In state six, the allocation is in system memory 130, like state three, except that application 135 may access the bits of the allocation directly. In state six, application 135 shouldn't send rendering commands to GPU 290 that reference the allocation (the application should first relinquish hold of the virtual address).
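The access rules for the six states described above can be summarized as a small lookup. The following is a minimal sketch in C rather than any interface of video memory manager 200: the enumerator names and the three helper predicates are hypothetical, and the entry for state one is inferred from the references to it above (bits in physical video memory, not held by the application).

```c
#include <stdbool.h>

/* Hypothetical names for the six states of a dynamic allocation. */
typedef enum {
    STATE_VIDMEM_GPU   = 1, /* state one (inferred): in video memory, GPU may use it */
    STATE_VIDMEM_CPU   = 2, /* state two: in video memory, VAD rotated there, held by the app */
    STATE_SYSMEM_IDLE  = 3, /* state three: pageable system memory, app should not touch it */
    STATE_APERTURE_GPU = 4, /* state four: locked pages visible through the aperture */
    STATE_APERTURE_CPU = 5, /* state five: visible through the aperture, held by the app */
    STATE_SYSMEM_CPU   = 6  /* state six: system memory, held by the app */
} alloc_state;

/* True when the application may dereference the virtual address directly. */
bool app_may_access(alloc_state s)
{
    return s == STATE_VIDMEM_CPU || s == STATE_APERTURE_CPU || s == STATE_SYSMEM_CPU;
}

/* True when a rendering command referencing the allocation may run as-is. */
bool gpu_may_render(alloc_state s)
{
    return s == STATE_VIDMEM_GPU || s == STATE_APERTURE_GPU;
}

/* True when the system pages are pinned so they cannot be paged to disk. */
bool pages_locked(alloc_state s)
{
    return s == STATE_APERTURE_GPU || s == STATE_APERTURE_CPU;
}
```

In this sketch no state allows both direct application access and GPU rendering, which mirrors the requirement that the application relinquish its hold on the virtual address before rendering.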

[0123] FIG. 13 shows an illustrative method 1000 for video memory management. While the description of FIGS. 13 and 14 refers to various managers (e.g., physical memory manager 320, etc.), it should be appreciated that the method could be implemented with a single manager, or any number of managers. Further, the various functionalities may be distributed among the various managers in any convenient fashion.

[0124] As shown in FIG. 13, at step 1010, virtual memory manager 310 allocates virtual memory for referencing some physical memory, which in turn stores the graphics data.

[0125] At step 1020, physical memory manager 320 allocates the physical memory to store the graphics data. The physical memory may be located in video memory 291, may be located in system memory 130 and not accessible via aperture 292, may be located in system memory 130 and accessible via aperture 292, and the like.

[0126] At step 1030, virtual memory manager 310 maps from the virtual address allocated in step 1010 to the physical address allocated at step 1020. In this manner, by working with virtual addresses, application 135 or driver 210 may request the graphics data without having to know where the graphics data is currently stored.

[0127] At step 1040, video memory manager 200 moves the graphics data from one physical location to another physical location, from being mapped through aperture 292 to not being mapped through aperture 292, and the like. At step 1050, virtual memory manager 310 maps from the virtual address to the “new” physical address.
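Steps 1010 through 1050 can be illustrated with a small data structure in which the caller holds only a virtual allocation and the backing location changes underneath it. This is a minimal sketch under stated assumptions: the types `location`, `phys_addr`, and `virt_alloc` and the functions `vm_create`, `vm_move`, and `vm_resolve` are hypothetical names, and no real memory is moved.

```c
#include <stddef.h>

/* Hypothetical set of places where the graphics data may physically live. */
typedef enum {
    LOC_SYSTEM_MEMORY,       /* system memory, not mapped through the aperture */
    LOC_SYSTEM_VIA_APERTURE, /* system memory, mapped through the aperture */
    LOC_VIDEO_MEMORY         /* local video memory */
} location;

typedef struct {
    location where;  /* which pool currently backs the data */
    size_t   offset; /* offset within that pool */
} phys_addr;

typedef struct {
    size_t    size;    /* size of the virtual range (step 1010) */
    phys_addr backing; /* current virtual-to-physical mapping */
} virt_alloc;

/* Steps 1010-1030: allocate a virtual range and map it to physical memory. */
virt_alloc vm_create(size_t size, phys_addr initial)
{
    virt_alloc va;
    va.size = size;
    va.backing = initial;
    return va;
}

/* Steps 1040-1050: the data moves, and the same virtual range is remapped. */
void vm_move(virt_alloc *va, phys_addr destination)
{
    va->backing = destination;
}

/* Callers resolve through the virtual allocation and never cache the result. */
phys_addr vm_resolve(const virt_alloc *va)
{
    return va->backing;
}
```

Because callers always resolve through the `virt_alloc`, a move in step 1040 is invisible to them apart from the updated result of `vm_resolve`.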

[0128] FIG. 14 illustrates more details of step 1040. While FIG. 14 shows four steps, each step may be individually executed, and the steps may be executed in any order. Video memory manager 200 may decide which step to execute based on when a particular physical memory is full, based on trying to balance GPU access between multiple applications, and the like. As shown in FIG. 14, at step 1140, video memory manager 200 moves graphics data to video memory 291. At step 1150, video memory manager 200 evicts graphics data from video memory 291. At step 1160, video memory manager 200 makes graphics data in system memory 130 accessible through aperture 292. At step 1170, video memory manager 200 evicts graphics data from being accessible through aperture 292. Video memory manager 200 may execute each step differently depending on the type of graphics data.

[0129] For example, for a resource, such as an application resource or a driver resource, virtual memory manager 310 may allocate and commit a kernel virtual address range for the resource. To bring the resource to local video memory 291, physical memory manager 320 may allocate memory in local video memory 291 for containing the resource and may cause the resource to be copied from memory corresponding to the committed kernel virtual address range to the memory allocated in local video memory 291. Virtual memory manager 310 may map the kernel virtual address range to the memory allocated in local video memory 291.

[0130] To evict the resource from local video memory 291, physical memory manager 320 may cause the resource to be copied from the memory allocated in local video memory 291 to memory corresponding to the committed kernel virtual address range and then free the memory allocated in local video memory 291. Virtual memory manager 310 may free the mapped kernel virtual address range.

[0131] To bring the resource “into” aperture 292, physical memory manager 320 may lock the committed kernel virtual address range, whereby the operating system does not have permission to page out the resource from the system memory corresponding to the committed kernel virtual address range. Graphics processing unit aperture manager 330 may allocate an address range in aperture 292 for redirection to the resource and may cause the address range in aperture 292 to be mapped to the committed kernel virtual address range.

[0132] To evict the resource from aperture 292, graphics processing unit aperture manager 330 may unmap and free the address range allocated in aperture 292. Physical memory manager 320 may unlock the committed kernel virtual address range, whereby the operating system has permission to page out the resource from the system memory corresponding to the committed kernel virtual address range.
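The four operations on a resource described above (bring to local video memory, evict from it, bring into the aperture, evict from the aperture) can be simulated in user space. The sketch below is an assumption-laden illustration: the `resource` structure and its functions are hypothetical, heap memory from `malloc` stands in for local video memory 291, and two flags stand in for page locking and aperture mapping.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *backing;     /* memory behind the committed virtual range */
    unsigned char *vidmem;      /* simulated local video memory copy, or NULL */
    size_t         size;
    bool           locked;      /* pages pinned, so the OS may not page them out */
    bool           in_aperture; /* an aperture range redirects to the backing pages */
} resource;

/* Allocate simulated video memory and copy the resource into it. */
bool resource_bring_to_vidmem(resource *r)
{
    if (r->vidmem != NULL)
        return true; /* already resident */
    r->vidmem = malloc(r->size);
    if (r->vidmem == NULL)
        return false; /* no room; a real manager would try eviction first */
    memcpy(r->vidmem, r->backing, r->size);
    return true;
}

/* Copy the resource back to its backing memory, then free the video memory. */
void resource_evict_from_vidmem(resource *r)
{
    if (r->vidmem == NULL)
        return;
    memcpy(r->backing, r->vidmem, r->size);
    free(r->vidmem);
    r->vidmem = NULL;
}

/* Lock the pages first, then expose them through the aperture. */
void resource_bring_to_aperture(resource *r)
{
    r->locked = true;
    r->in_aperture = true;
}

/* Unmap from the aperture first, then allow paging again. */
void resource_evict_from_aperture(resource *r)
{
    r->in_aperture = false;
    r->locked = false;
}
```

The ordering inside the two aperture functions follows the text: pages are locked before the aperture redirects to them, and the aperture range is unmapped before the pages are unlocked.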

[0133] Alternatively, a resource may be permanently allocated in video memory 291 or aperture 292. To permanently allocate memory in video memory 291, physical memory manager 320 may allocate memory for the resource in the local video memory and not evict the allocated memory from local video memory. Virtual memory manager 310 may allocate and commit a kernel virtual address range for the resource and map the kernel virtual address range to the memory allocated in local video memory.

[0134] To permanently allocate memory in aperture 292, virtual memory manager 310 may allocate and commit a kernel virtual address range for the resource. Physical memory manager 320 may lock the committed kernel virtual address range, whereby the operating system does not have permission to page out the resource from the system memory corresponding to the committed kernel virtual address range. Graphics processing unit aperture manager 330 may allocate an address range in aperture 292 for redirection to the resource and cause the allocated address range in aperture 292 to be mapped to the committed kernel virtual address range.

[0135] For static surfaces, virtual memory manager 310 may allocate and commit an application private virtual address range for the surface. To bring the static surface into video memory 291, physical memory manager 320 may allocate memory in local video memory 291 for containing the surface and cause the surface to be copied from memory corresponding to the committed application private virtual address range to the memory allocated in local video memory 291.

[0136] To evict the static surface, physical memory manager 320 may cause the surface to be copied from the memory allocated in local video memory 291 to memory corresponding to the committed application private virtual address range and then free the memory allocated in local video memory 291.

[0137] To bring the static surface “into” aperture 292, physical memory manager 320 may lock the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range. Graphics processing unit aperture manager 330 may allocate an address range in aperture 292 for redirection to the surface and cause the address range in aperture 292 to be mapped to the committed application private virtual address range.

[0138] To evict the static surface from aperture 292, graphics processing unit aperture manager 330 may unmap and free the allocated graphics processing unit aperture address range. Physical memory manager 320 may unlock the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.

[0139] For dynamic surfaces, virtual memory manager 310 may allocate and commit an application private virtual address range for the surface. To bring the dynamic surface into local video memory 291, physical memory manager 320 may allocate memory in local video memory 291 for containing the surface and cause the surface to be copied from the memory corresponding to the committed application private virtual address range to the memory allocated in local video memory 291. Virtual memory manager 310 may map the committed application private virtual address range to the memory allocated in local video memory 291.

[0140] To evict the dynamic surface from video memory 291, physical memory manager 320 may cause the surface to be copied from the memory allocated in local video memory 291 to memory corresponding to the committed application private virtual address range and may then free the allocated memory in local video memory 291. Virtual memory manager 310 may then remap the committed application private virtual address range to the memory corresponding to the committed application private virtual address range.

[0141] To bring the dynamic surface “into” aperture 292, physical memory manager 320 may lock the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range. Graphics processing unit aperture manager 330 may allocate an address range in aperture 292 for redirection to the surface and cause the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.

[0142] To evict the dynamic surface from aperture 292, graphics processing unit aperture manager 330 may unmap and free the allocated graphics processing unit aperture address range. Physical memory manager 320 may unlock the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.

[0143] Illustrative application programming interfaces are given below for dynamic video memory allocation and deallocation.

    NTSTATUS VidMmAllocateDynamic(
        IN PVOID HwDeviceExtension,
        IN DWORD dwFlags,
        IN SIZE_T Size,
        IN ULONG ulAlignment,
        OUT PHANDLE Handle);

    NTSTATUS VidMmFreeDynamic(
        IN PVOID HwDeviceExtension,
        IN HANDLE Handle);

[0144] Allocating a dynamic video memory allocation may be performed in two steps. First, the virtual address of the allocation is allocated. Second, the actual GPU resources (e.g., physical video memory 291, non-local aperture 292) to store the bits are allocated (typically after the allocation creation time and upon the first application access to the allocation).

[0145] Video memory manager 200 may store bits in system memory 130 by rotating the virtual address associated with the allocation back to system memory.

[0146] Video memory manager 200 may make an allocation visible through non-local aperture 292 by making sure the allocation bits are in system memory 130 then pinning down or locking the pages forming the allocation in physical system memory 130 so that the paging system doesn't send them to disk. Once the pages are locked, video memory manager 200 may allocate a range in non-local aperture 292 that is visible to GPU 290 and reprogram aperture 292 to redirect that range to the physical system memory pages. The allocation of address space in non-local aperture 292 may be done through non-local aperture manager 330. Once visible or accessible through non-local aperture 292, a dynamic video memory allocation may be associated with a handle from non-local aperture manager 330.

[0147] Video memory manager 200 may store bits in video memory 291 by using physical video memory manager 320 to allocate a range of physical video memory 291 in one of the physical memory segments defined by the driver. Since a dynamic allocation is visible to CPU 120, virtual memory manager 310 looks at the segments that have been defined as visible to CPU 120. If more than one segment could hold the allocation, virtual memory manager 310 chooses a segment. The choice may be made by trying to balance the allocations across the segments. Rules for such balancing include: if a heap has a free hole big enough for the allocation, use it; if a heap has a lot more free memory than another, use it; use the heap with the oldest allocation; and the like.
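The balancing rules listed above do not fix a precedence, so the following sketch picks one plausible ordering as an assumption: only CPU-visible segments with a large enough free hole are considered, the one with the most free memory wins, and ties go to the segment holding the oldest allocation. The `segment` structure, its fields, and `choose_segment` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool     cpu_visible;        /* segment defined as visible to the CPU */
    size_t   largest_hole;       /* largest contiguous free range, in bytes */
    size_t   free_bytes;         /* total free memory, in bytes */
    uint64_t oldest_alloc_fence; /* fence of the oldest allocation; lower is older */
} segment;

/* Returns the index of the chosen segment, or -1 if none can hold the request. */
int choose_segment(const segment *segs, int count, size_t size)
{
    int best = -1;
    for (int i = 0; i < count; i++) {
        const segment *s = &segs[i];
        /* Rule 1: the segment must be CPU visible and have a big enough hole. */
        if (!s->cpu_visible || s->largest_hole < size)
            continue;
        if (best < 0) {
            best = i;
            continue;
        }
        const segment *b = &segs[best];
        /* Rule 2: prefer more free memory. Rule 3: on a tie, prefer the
           segment whose oldest allocation is older. */
        if (s->free_bytes > b->free_bytes ||
            (s->free_bytes == b->free_bytes &&
             s->oldest_alloc_fence < b->oldest_alloc_fence)) {
            best = i;
        }
    }
    return best;
}
```

A return value of -1 corresponds to the case where no visible segment can hold the allocation, at which point a manager would have to evict something or fall back to the aperture.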

[0148] Once the content of the surface is transferred to physical video memory 291, the virtual address may be MEM_RESET so the memory manager of the operating system 134 won't send the pages to hard disk 141. The virtual address is rotated to the physical video memory address only on the first lock; until then, it continues to refer to system memory 130. An illustrative application programming interface is given below for beginning GPU access.

    NTSTATUS VidMmBeginGPUAccess(
        IN PVOID HwDeviceExtension,
        IN PHANDLE phAlloc,
        IN VIDMM_FENCE Fence,
        OUT PBOOLEAN NonLocalVideo,
        OUT PPHYSICAL_ADDRESS PhysAddr);

[0149] Hardware not supporting demand paging of video memory specifies which allocations will be used by the hardware before posting a command buffer to GPU 290 so that video memory manager 200 can make those allocations visible to GPU 290. The notification may be done through the VidMmBeginGPUAccess( ) API. (The duration of access may be controlled by a fencing mechanism, described below.)

[0150] When VidMmBeginGPUAccess( ) is called, video memory manager 200 verifies whether the allocations are currently visible to GPU 290. If the allocations are not visible to GPU 290, video memory manager 200 brings the allocations to local physical video memory 291 or non-local video aperture 292. Video memory manager 200 may go through the list of allocations provided by driver 210 and try to make all of them visible to GPU 290 by allocating physical video memory 291 or mapping through non-local aperture 292. When trying to allocate GPU resources, it's possible that the allocation fails because there isn't enough free room. When this happens, physical memory manager 320 or non-local aperture manager 330 tries to evict some unused allocation to make room for the new one. Video memory manager 200 may not be able to allocate memory immediately but may wait until GPU 290 is done with some surface. It is possible that the function will fail to bring the allocations back in memory. If the call fails, driver 210 may break down the command buffer into smaller pieces and call video memory manager 200 again for each subset of allocations.
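The fallback at the end of the preceding paragraph, in which driver 210 breaks the command buffer into smaller pieces when the call fails, can be sketched as a recursive halving of the allocation list. Everything named here is hypothetical: `vidmm` is a stand-in with a single capacity, `vidmm_make_visible` merely checks whether the listed sizes fit together, and halving is only one possible splitting policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the manager: one pool with a fixed capacity. */
typedef struct {
    size_t capacity;
} vidmm;

/* Succeeds only if all listed allocations fit in the pool at the same time. */
static bool vidmm_make_visible(const vidmm *mm, const size_t *sizes, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += sizes[i];
    return total <= mm->capacity;
}

/* Returns how many separate submissions are needed for the allocation list,
   or -1 if some single allocation can never be made visible. */
int submit_in_pieces(const vidmm *mm, const size_t *sizes, size_t count)
{
    if (count == 0)
        return 0;
    if (vidmm_make_visible(mm, sizes, count))
        return 1; /* the whole piece can be submitted at once */
    if (count == 1)
        return -1; /* cannot split any further */

    /* Split the list in half and retry each subset independently. */
    size_t half = count / 2;
    int left = submit_in_pieces(mm, sizes, half);
    int right = submit_in_pieces(mm, sizes + half, count - half);
    if (left < 0 || right < 0)
        return -1;
    return left + right;
}
```

The sketch assumes each piece is fully retired before the next is submitted, so the full capacity is available to every piece.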

[0151] Once the allocations are in physical video memory 291 or mapped through non-local aperture 292, they may remain there as long as they are being used by GPU 290. To determine when an allocation is no longer in use, a fencing mechanism may be used. The fence may be a 64-bit monotonic counter that is updated by the display hardware, GPU 290, each time a partial command buffer is completed.

[0152] FIG. 15 depicts the usage of a fence for coordination between video memory manager 200 and driver 210. Using the fence, video memory manager 200 can determine if an allocation is currently busy (in use or shortly to be in use by GPU 290) by comparing the fence associated with an allocation with the last fence processed by GPU 290.
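The busy test described above reduces to a single comparison between two counters. The sketch below assumes a hypothetical `allocation` structure that records the fence of the last command buffer referencing it; because the fence is described as a 64-bit monotonic counter, wraparound is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t last_fence; /* fence of the last command buffer that references it */
} allocation;

/* An allocation is busy while the GPU has not yet completed its last fence. */
bool allocation_is_busy(const allocation *a, uint64_t last_completed_fence)
{
    return a->last_fence > last_completed_fence;
}
```

An allocation whose fence equals the last completed fence is treated as idle here, since the command buffer that used it has finished.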

[0153] VidMmBeginGPUAccess may also acquire usage information about the allocations. Because driver 210 may notify video memory manager 200 each time GPU 290 requests the use of an allocation, this is a good place to build usage information. This usage information may be used by video memory manager 200 when physical video memory 291 or non-local aperture 292 is full and video memory manager 200 wants to find a candidate allocation for eviction. Each time an allocation is used, it may be put at the end of a list of allocations. Thus, when video memory manager 200 wants to evict an allocation, it can use that ordered list to find the best candidate. Video memory manager 200 can also compare the last fence of an allocation to the last fence GPU 290 processed to generate an estimate of how long ago the allocation was used.
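The ordered list described above behaves like a least-recently-used list combined with a fence check. The sketch below is one possible arrangement, not the patented implementation: `usage_list`, `usage_touch`, and `usage_pick_victim` are hypothetical names, allocations are identified by small integers, and the list is a fixed-size array for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ALLOCS 16

typedef struct {
    int      ids[MAX_ALLOCS];    /* allocation identifiers, oldest use first */
    uint64_t fences[MAX_ALLOCS]; /* fence of the most recent use of each one */
    size_t   count;
} usage_list;

/* Record a use: move the allocation to the end of the list with its new fence. */
void usage_touch(usage_list *l, int id, uint64_t fence)
{
    size_t i = 0;
    while (i < l->count && l->ids[i] != id)
        i++;
    if (i == l->count) {
        if (l->count == MAX_ALLOCS)
            return; /* list full: ignored in this sketch */
        l->count++;
    }
    /* Close the gap left by the entry, then append it at the end. */
    for (; i + 1 < l->count; i++) {
        l->ids[i] = l->ids[i + 1];
        l->fences[i] = l->fences[i + 1];
    }
    l->ids[l->count - 1] = id;
    l->fences[l->count - 1] = fence;
}

/* Pick the least recently used allocation the GPU is done with, or -1. */
int usage_pick_victim(const usage_list *l, uint64_t last_completed_fence)
{
    for (size_t i = 0; i < l->count; i++) {
        if (l->fences[i] <= last_completed_fence)
            return l->ids[i];
    }
    return -1;
}
```

When `usage_pick_victim` returns -1, every allocation is still busy, which matches the case where the manager must wait until GPU 290 is done with some surface.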

[0154] When application 135 desires direct access to a dynamic surface, it may use the lock mechanism provided by the DirectX runtime. When application 135 locks a surface, the runtime calls driver 210 with the lock request. Driver 210 then verifies which actual allocation to return to application 135, and may call the VidMmBeginUserAccess( ) function (illustrative API shown below) in video memory manager 200 to get the linear address that was allocated for application 135 at creation time. If the virtual address is still referencing system memory 130, it may be rotated to the current location of the surface in video memory 291 before being returned.

    NTSTATUS VidMmBeginUserAccess(
        IN PVOID HwDeviceExtension,
        IN HANDLE hAlloc,
        OUT PVOID pvLin);

    NTSTATUS VidMmEndUserAccess(
        IN PVOID HwDeviceExtension,
        IN HANDLE hAlloc);

[0155] VidMmBeginUserAccess( ) doesn't have to page in or evict the allocation; rather, it can safely keep the allocation at its current location and let driver 210 access the allocation. If the allocation is in video memory 291 and is to be reclaimed while it's being accessed, the eviction process can ensure there's no loss of data during the transfer. An illustrative eviction API is given below.

    NTSTATUS VidMmEvict(
        IN PVOID HwDeviceExtension,
        IN HANDLE hAlloc);

[0156] Hardware Considerations

[0157] There are some characteristics of GPU hardware that may affect the implementation of video memory manager 200 and driver 210. Those characteristics include a GPU programmable aperture and demand paging.

[0158] A GPU programmable aperture is used by some GPU hardware to give a virtual view of video memory 291 to GPU 290. Each application 135 has its own virtual view of video memory 291 and each allocation done for that application is allocated a contiguous range within a private aperture. For hardware that doesn't support a GPU programmable aperture, video memory manager 200 may allocate contiguous blocks of memory. Allocating large contiguous blocks may be inefficient and may cause many evictions. Allocating on a page basis may reduce fragmentation.

[0159] A GPU programmable aperture may be useful for protecting video memory 291. Since each application may have its own private aperture, each application will see (via GPU 290) only the surfaces allocated for that application. When GPU 290 is running in one application's context, it is not able to access any video memory that wasn't allocated for that application. If GPU 290 tries to access an address in the aperture that wasn't allocated to that application, an interrupt is generated by GPU 290 and video memory manager 200 may inject an exception in the application causing a fault (blocking any further rendering from that application until it reinitializes its context).
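The protection property described above amounts to a range check against the ranges allocated in the current application's private aperture. The following sketch models that check in software under stated assumptions: `aperture_range`, `gpu_context`, and `context_may_access` are hypothetical names, and on real hardware the check would be performed by the GPU itself rather than by code like this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base; /* start of a range allocated in the private aperture */
    uint64_t size; /* length of the range in bytes */
} aperture_range;

typedef struct {
    const aperture_range *ranges; /* ranges allocated for this application */
    size_t                count;
} gpu_context;

/* True if the address lies in a range allocated for this context. A false
   result models the case where the GPU would raise an interrupt. */
bool context_may_access(const gpu_context *ctx, uint64_t addr)
{
    for (size_t i = 0; i < ctx->count; i++) {
        const aperture_range *r = &ctx->ranges[i];
        /* Comparing the offset avoids overflow in base + size. */
        if (addr >= r->base && addr - r->base < r->size)
            return true;
    }
    return false;
}
```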

[0160] Demand paging is a mechanism by which some GPUs indicate that the GPU desires access to a surface that is not currently cached in video memory 291. With old GPU hardware, video memory manager 200 may confirm that all surfaces referenced by a command buffer are cached in video memory 291 before submitting the command buffer to GPU 290. Since there is no way for video memory manager 200 to determine which surfaces will actually be used by GPU 290, it loads all of those surfaces entirely. If a command buffer is built in user mode, kernel mode components may parse the command buffer to load all those surfaces in video memory 291 before submitting the command buffer to GPU 290. Since the command buffer is located in uncached memory, reading from that buffer is very inefficient. Also, a command buffer might be referencing more memory than can actually be loaded at once, which requires that the driver split the command buffer into multiple sub-buffers and submit them separately.

[0161] In order to make this process more efficient, some GPUs can support demand paging. Demand paging may use a GPU programmable aperture. The aperture contains present flags for all pages of each surface. If a page being accessed is currently not present, the GPU signals an interrupt. In response to the interrupt, video memory manager 200 may take control of CPU 120, bring the pages in from system memory 130, and restart the graphics operation that caused the fault.
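The present-flag mechanism described above can be modeled with a per-page flag array and a fault counter. This is a simplified sketch under stated assumptions: `surface_pages` and `gpu_access_page` are hypothetical, the page count is fixed, and the interrupt, page-in, and restart sequence is collapsed into a single function call.

```c
#include <stdbool.h>
#include <stddef.h>

#define SURFACE_PAGES 8

typedef struct {
    bool     present[SURFACE_PAGES]; /* one present flag per page of the surface */
    unsigned faults;                 /* number of simulated GPU interrupts */
} surface_pages;

/* Simulates a GPU access to one page. If the page is not present, the access
   faults, the handler pages it in, and the access is then retried. */
bool gpu_access_page(surface_pages *s, size_t page)
{
    if (page >= SURFACE_PAGES)
        return false; /* outside the surface */
    if (!s->present[page]) {
        s->faults++;             /* the GPU signals an interrupt */
        s->present[page] = true; /* the manager brings the page in */
    }
    return true; /* the restarted access succeeds */
}
```

Only the pages actually touched are brought in, which is the efficiency gain over loading every surface a command buffer references.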

[0162] Program code (i.e., instructions) for performing the above-described methods may be stored on a computer-readable medium, such as a magnetic, electrical, or optical storage medium, including without limitation a floppy diskette, CD-ROM, CD-RW, DVD-ROM, DVD-RAM, magnetic tape, flash memory, hard disk drive, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The invention may also be embodied in the form of program code that is transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, over a network, including the Internet or an intranet, or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the above-described processes. When implemented on a general-purpose processor, the program code combines with the processor to provide an apparatus that operates analogously to specific logic circuits.

[0163] It is noted that the foregoing description has been provided merely for the purpose of explanation and is not to be construed as limiting of the invention. While the invention has been described with reference to illustrative embodiments, it is understood that the words which have been used herein are words of description and illustration, rather than words of limitation. Further, although the invention has been described herein with reference to particular structure, methods, and embodiments, the invention is not intended to be limited to the particulars disclosed herein; rather, the invention extends to all structures, methods and uses that are within the scope of the appended claims. Those skilled in the art, having the benefit of the teachings of this specification, may effect numerous modifications thereto and changes may be made without departing from the scope and spirit of the invention, as defined by the appended claims.

What is claimed:
 1. A video memory manager for a computer environment having a main processing unit for executing an operating system and an application, a system memory, and a graphics processing unit having a local video memory and an aperture that maps between a portion of the system memory and the graphics processing unit, the video memory manager comprising: a physical memory manager that manages the physical memory of the local video memory and at least a portion of the physical memory of the system memory; a graphics processing unit aperture manager that manages the memory mappings between a portion of system memory and the graphics processing unit, such that video data in the system memory is accessible to the graphics processing unit via the aperture; and a virtual memory manager that allocates virtual memory and maintains mappings between the allocated virtual memory and the physical memory of the local video memory, the physical memory of the system memory, and the physical memory of the system memory via the aperture.
 2. The video memory manager as recited in claim 1, wherein the physical memory manager further: allocates memory for one of a driver resource and an application resource in the local video memory and does not evict the allocated memory from local video memory; and the virtual memory manager further: allocates and commits a kernel virtual address range for one of the driver resource and the application resource; and maps the kernel virtual address range to the memory allocated in local video memory.
 3. The video memory manager as recited in claim 1, wherein the virtual memory manager further: allocates and commits a kernel virtual address range for one of a driver resource and an application resource; and the physical memory manager further: locks the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; and the graphics processing unit aperture manager: allocates an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causes the allocated address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 4. The video memory manager as recited in claim 1, wherein the virtual memory manager further: allocates and commits a kernel virtual address range for one of a driver resource and an application resource.
 5. The video memory manager as recited in claim 4, wherein the physical memory manager further: allocates memory in the local video memory for containing one of the driver resource and the application resource; and causes one of the driver resource and the application resource to be copied from memory corresponding to the committed kernel virtual address range to the memory allocated in the local video memory; and the virtual memory manager further: maps the kernel virtual address range to the memory allocated in the local video memory.
 6. The video memory manager as recited in claim 5, wherein the physical memory manager further: causes one of the driver resource and the application resource to be copied from the memory allocated in the local video memory to memory corresponding to the committed kernel virtual address range; and frees the memory allocated in the local video memory; and the virtual memory manager further: frees the mapped kernel virtual address range.
 7. The video memory manager as recited in claim 4, wherein the physical memory manager further: locks the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; and the graphics processing unit aperture manager further: allocates an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causes the address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 8. The video memory manager as recited in claim 7, wherein the graphics processing unit aperture manager further: unmaps and frees the address range allocated in the graphics processing unit aperture; and the physical memory manager further: unlocks the committed kernel virtual address range, whereby the operating system has permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range.
 9. The video memory manager as recited in claim 1, wherein the virtual memory manager further: allocates and commits an application private virtual address range for a surface, the surface not being directly accessible by the application via the main processing unit.
 10. The video memory manager as recited in claim 9, wherein the physical memory manager further: allocates memory in the local video memory for containing the surface; and causes the surface to be copied from memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory.
 11. The video memory manager as recited in claim 10, wherein the physical memory manager further: causes the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; and frees the memory allocated in the local video memory.
 12. The video memory manager as recited in claim 9, wherein the physical memory manager further: locks the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; and the graphics processing unit aperture manager further: allocates an address range in the graphics processing unit aperture for redirection to the surface; and causes the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 13. The video memory manager as recited in claim 12, wherein the graphics processing unit aperture manager further: unmaps and frees the allocated graphics processing unit aperture address range; and the physical memory manager further: unlocks the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
 14. The video memory manager as recited in claim 1, wherein the virtual memory manager further: allocates and commits an application private virtual address range for a surface, the surface being directly accessible by the application via the main processing unit.
 15. The video memory manager as recited in claim 14, wherein the physical memory manager further: allocates memory in the local video memory for containing the surface; and causes the surface to be copied from the memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory; and the virtual memory manager further: maps the committed application private virtual address range to the memory allocated in the local video memory.
 16. The video memory manager as recited in claim 15, wherein the physical memory manager further: causes the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; and frees the allocated memory in the local video memory; and the virtual memory manager further: remaps the committed application private virtual address range to the memory corresponding to the committed application private virtual address range.
 17. The video memory manager as recited in claim 14, wherein the physical memory manager further: locks the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; and the graphics processing unit aperture manager further: allocates an address range in the graphics processing unit aperture for redirection to the surface; and causes the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 18. The video memory manager as recited in claim 17, wherein the graphics processing unit aperture manager further: unmaps and frees the allocated graphics processing unit aperture address range; and the physical memory manager further: unlocks the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
 19. A method for video memory management in a computer environment having a main processing unit for executing an operating system and an application, a system memory, and a graphics processing unit having a local video memory and an aperture that maps between a portion of system memory and the graphics processing unit, the method comprising: managing the physical memory of the local video memory and at least a portion of the physical memory of the system memory; managing the memory mappings between a portion of system memory and the graphics processing unit, such that video data in the system memory is accessible to the graphics processing unit via the aperture; and allocating virtual memory and maintaining mappings between the allocated virtual memory and the physical memory of the local video memory, the physical memory of the system memory, and the physical memory of the system memory accessible via the aperture.
 20. The method as recited in claim 19, further comprising: allocating memory for one of a driver resource and an application resource in the local video memory and not evicting the allocated memory from local video memory; allocating and committing a kernel virtual address range for one of the driver resource and the application resource; and mapping the kernel virtual address range to the memory allocated in local video memory.
 21. The method as recited in claim 19, further comprising: allocating and committing a kernel virtual address range for one of a driver resource and an application resource; locking the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; allocating an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causing the allocated address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 22. The method as recited in claim 19, further comprising: allocating and committing a kernel virtual address range for one of a driver resource and an application resource.
 23. The method as recited in claim 22, further comprising: allocating memory in the local video memory for containing one of the driver resource and the application resource; causing one of the driver resource and the application resource to be copied from memory corresponding to the committed kernel virtual address range to the memory allocated in the local video memory; and mapping the kernel virtual address range to the memory allocated in the local video memory.
 24. The method as recited in claim 23, further comprising: causing one of the driver resource and the application resource to be copied from the memory allocated in the local video memory to memory corresponding to the committed kernel virtual address range; freeing the memory allocated in the local video memory; and freeing the mapped kernel virtual address range.
 25. The method as recited in claim 22, further comprising: locking the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; allocating an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causing the address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 26. The method as recited in claim 25, further comprising: unmapping and freeing the address range allocated in the graphics processing unit aperture; and unlocking the committed kernel virtual address range, whereby the operating system has permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range.
 27. The method as recited in claim 19, further comprising: allocating and committing an application private virtual address range for a surface, the surface not being directly accessible by the application via the main processing unit.
 28. The method as recited in claim 27, further comprising: allocating memory in the local video memory for containing the surface; and causing the surface to be copied from memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory.
 29. The method as recited in claim 28, further comprising: causing the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; and freeing the memory allocated in the local video memory.
 30. The method as recited in claim 27, further comprising: locking the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; allocating an address range in the graphics processing unit aperture for redirection to the surface; and causing the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 31. The method as recited in claim 30, further comprising: unmapping and freeing the allocated graphics processing unit aperture address range; and unlocking the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
 32. The method as recited in claim 19, further comprising: allocating and committing an application private virtual address range for a surface, the surface being directly accessible by the application via the main processing unit.
 33. The method as recited in claim 32, further comprising: allocating memory in the local video memory for containing the surface; causing the surface to be copied from the memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory; and mapping the committed application private virtual address range to the memory allocated in the local video memory.
 34. The method as recited in claim 33, further comprising: causing the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; freeing the allocated memory in the local video memory; and remapping the committed application private virtual address range to the memory corresponding to the committed application private virtual address range.
 35. The method as recited in claim 32, further comprising: locking the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; allocating an address range in the graphics processing unit aperture for redirection to the surface; and causing the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 36. The method as recited in claim 35, further comprising: unmapping and freeing the allocated graphics processing unit aperture address range; and unlocking the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
 37. The method as recited in claim 19, further comprising: determining whether a set of graphics data to be stored in a memory is larger than a predefined size; and if the set of graphics data is larger than the predefined size, allocating memory that is a multiple of a page size of the operating system and storing only the set of graphics data in the allocated memory; and if the set of graphics data is not larger than the predefined size, allocating memory that is a multiple of a page size of the operating system and storing the set of graphics data in the allocated memory along with other sets of graphics data.
 38. The method as recited in claim 19, further comprising: querying the graphics processing unit for an indication of a command that was last processed by the graphics processing unit; receiving, from the graphics processing unit, an indication of a command that was last processed by the graphics processing unit; and determining which allocation to evict based on the indication of the command that was last processed by the graphics processing unit.
 39. The method as recited in claim 19, further comprising: queuing a marker into a rendering stream communicated to the graphics processing unit; querying the graphics processing unit for an indication of a marker that was last processed by the graphics processing unit; and determining which allocation to evict based on the indication of the marker that was last processed by the graphics processing unit.
 40. The method as recited in claim 38, wherein the indication of the command that was last processed by the graphics processing unit comprises a monotonic counter that is updated by the graphics processing unit each time a partial command buffer is completed.
 41. The method as recited in claim 19, further comprising: associating a handle with graphics data, the handle being a unique and permanent representation of the graphics data, the graphics data being stored at a physical address; and converting, upon a request including the handle, the handle to the physical address.
 42. The method as recited in claim 19, further comprising: timestamping graphics data with a fence identification; and marking the graphics data as a candidate for eviction based upon the timestamped fence identification.
 43. A method for managing memory in a computer environment having a main processing unit and a plurality of memories, the method comprising: allocating physical memory in a first one of the plurality of memories; storing a set of graphics data in the first one of the plurality of memories; allocating a virtual address range in at least one of the plurality of memories; mapping the virtual address range to the first one of the plurality of memories storing the graphics data; storing the graphics data in a second one of the plurality of memories; and remapping the virtual address range to map to the second one of the plurality of memories.
 44. The method as recited in claim 41, further comprising: determining that a second set of graphics data is a first type of graphics data; and storing the second set of graphics data in a predefined one of the plurality of memories and not evicting the second set of graphics data from the predefined one of the plurality of memories.
 45. The method as recited in claim 44, further comprising: determining that a second set of graphics data is a second type of graphics data; and storing the second set of graphics data in one of the plurality of memories and evicting the second set of graphics data from the one of the plurality of memories.
 46. The method as recited in claim 41, further comprising: determining that a second set of graphics data is a first type of graphics data; and storing the second set of graphics data in a kernel of an operating system of the computer environment.
 47. The method as recited in claim 41, further comprising: distinguishing between a first type of graphics data and a second type of graphics data; and storing the first type of graphics data in a kernel address space and storing the second type of graphics data in a user address space.
 48. A method for video memory management in a computer environment having a main processing unit for executing an operating system and an application, a system memory, and a graphics processing unit having local video memory, the method comprising: managing the physical memory of the local video memory and at least a portion of the physical memory of the system memory; and allocating virtual memory and maintaining mappings between the allocated virtual memory and the physical memory of the local video memory and the physical memory of the system memory.
 49. The method as recited in claim 48, further comprising: allocating memory for one of a driver resource and an application resource in the local video memory and not evicting the allocated memory from local video memory; allocating and committing a kernel virtual address range for one of the driver resource and the application resource; and mapping the kernel virtual address range to the memory allocated in local video memory.
 50. The method as recited in claim 48, further comprising: allocating and committing a kernel virtual address range for one of a driver resource and an application resource.
 51. The method as recited in claim 50, further comprising: allocating memory in the local video memory for containing one of the driver resource and the application resource; causing one of the driver resource and the application resource to be copied from memory corresponding to the committed kernel virtual address range to the memory allocated in the local video memory; and mapping the kernel virtual address range to the memory allocated in the local video memory.
 52. The method as recited in claim 51, further comprising: causing one of the driver resource and the application resource to be copied from the memory allocated in the local video memory to memory corresponding to the committed kernel virtual address range; freeing the memory allocated in the local video memory; and freeing the mapped kernel virtual address range.
 53. The method as recited in claim 48, further comprising: allocating and committing an application private virtual address range for a surface, the surface not being directly accessible by the application via the main processing unit.
 54. The method as recited in claim 53, further comprising: allocating memory in the local video memory for containing the surface; and causing the surface to be copied from memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory.
 55. The method as recited in claim 54, further comprising: causing the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; and freeing the memory allocated in the local video memory.
 56. The method as recited in claim 48, further comprising: allocating and committing an application private virtual address range for a surface, the surface being directly accessible by the application via the main processing unit.
 57. The method as recited in claim 56, further comprising: allocating memory in the local video memory for containing the surface; causing the surface to be copied from the memory corresponding to the committed application private virtual address range to the memory allocated in the local video memory; and mapping the committed application private virtual address range to the memory allocated in the local video memory.
 58. The method as recited in claim 57, further comprising: causing the surface to be copied from the memory allocated in the local video memory to memory corresponding to the committed application private virtual address range; freeing the allocated memory in the local video memory; and remapping the committed application private virtual address range to the memory corresponding to the committed application private virtual address range.
 59. A method for video memory management in a computer environment having a main processing unit for executing an operating system and an application, a system memory, and a graphics processing unit having an aperture that maps between a portion of system memory and the graphics processing unit, the method comprising: managing the memory mappings between a portion of system memory and the graphics processing unit, such that video data in the system memory is accessible to the graphics processing unit via the aperture; and allocating virtual memory and maintaining mappings between the allocated virtual memory and the physical memory of the local video memory, the physical memory of the system memory, and the physical memory of the system memory accessible via the aperture.
 60. The method as recited in claim 59, further comprising: allocating and committing a kernel virtual address range for one of a driver resource and an application resource; locking the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; allocating an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causing the allocated address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 61. The method as recited in claim 59, further comprising: allocating and committing a kernel virtual address range for one of a driver resource and an application resource.
 62. The method as recited in claim 61, further comprising: locking the committed kernel virtual address range, whereby the operating system does not have permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range; allocating an address range in the graphics processing unit aperture for redirection to one of the driver resource and the application resource; and causing the address range in the graphics processing unit aperture to be mapped to the committed kernel virtual address range.
 63. The method as recited in claim 62, further comprising: unmapping and freeing the address range allocated in the graphics processing unit aperture; and unlocking the committed kernel virtual address range, whereby the operating system has permission to page out one of the driver resource and the application resource from the system memory corresponding to the committed kernel virtual address range.
 64. The method as recited in claim 59, further comprising: allocating and committing an application private virtual address range for a surface, the surface not being directly accessible by the application via the main processing unit.
 65. The method as recited in claim 64, further comprising: locking the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; allocating an address range in the graphics processing unit aperture for redirection to the surface; and causing the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 66. The method as recited in claim 65, further comprising: unmapping and freeing the allocated graphics processing unit aperture address range; and unlocking the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
 67. The method as recited in claim 59, further comprising: allocating and committing an application private virtual address range for a surface, the surface being directly accessible by the application via the main processing unit.
 68. The method as recited in claim 67, further comprising: locking the committed application private virtual address range, whereby the operating system does not have permission to page out the surface from the system memory corresponding to the application private virtual address range; allocating an address range in the graphics processing unit aperture for redirection to the surface; and causing the address range in the graphics processing unit aperture to be mapped to the committed application private virtual address range.
 69. The method as recited in claim 68, further comprising: unmapping and freeing the allocated graphics processing unit aperture address range; and unlocking the committed application private virtual address range, whereby the operating system has permission to page out the surface from the system memory corresponding to the committed application private virtual address range.
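
By way of illustration only, the handle lookup and fence-based eviction steps recited in claims 38 through 43 may be sketched as follows. This is a minimal toy model, not the claimed implementation: the class `ToyVideoMemoryManager`, its methods, the `"video"` and `"system"` location labels, and the oldest-fence-first victim order are all hypothetical names and choices introduced for this sketch, and eviction is modeled as a relabeling rather than an actual copy and remap.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    handle: int      # stable identifier given to callers (hypothetical)
    size: int        # size in arbitrary units
    location: str    # "video" (local video memory) or "system"
    fence: int = 0   # id of the last queued command that uses this data


class ToyVideoMemoryManager:
    """Hypothetical model; no real GPU or driver API is involved."""

    def __init__(self, video_capacity):
        self.video_capacity = video_capacity
        self.video_used = 0
        self.allocations = {}
        self.next_handle = 1
        self.next_fence = 0
        self.completed_fence = 0  # last fence the GPU reported as processed

    def allocate(self, size):
        """Prefer local video memory; fall back to system memory."""
        location = "system"
        if self._make_room(size):
            location = "video"
            self.video_used += size
        alloc = Allocation(self.next_handle, size, location)
        self.allocations[alloc.handle] = alloc
        self.next_handle += 1
        return alloc.handle

    def submit(self, handles):
        """Queue a command and stamp each referenced allocation with its fence."""
        self.next_fence += 1
        for handle in handles:
            self.allocations[handle].fence = self.next_fence
        return self.next_fence

    def gpu_completed(self, fence):
        """Record the monotonic 'last processed' value reported by the GPU."""
        self.completed_fence = max(self.completed_fence, fence)

    def resolve(self, handle):
        """Convert a handle to the current location of its data."""
        return self.allocations[handle].location

    def _make_room(self, size):
        # Only allocations whose last use has completed are eviction candidates.
        idle = sorted(
            (a for a in self.allocations.values()
             if a.location == "video" and a.fence <= self.completed_fence),
            key=lambda a: a.fence,
        )
        free = self.video_capacity - self.video_used
        if free + sum(a.size for a in idle) < size:
            return False  # not enough reclaimable space; evict nothing
        for victim in idle:
            if self.video_capacity - self.video_used >= size:
                break
            victim.location = "system"  # stands in for copy-out plus remap
            self.video_used -= victim.size
        return True


# Usage: two allocations fill video memory; one becomes evictable once the
# GPU reports that the command using it has been processed.
mgr = ToyVideoMemoryManager(video_capacity=100)
a = mgr.allocate(60)
b = mgr.allocate(40)
fence_a = mgr.submit([a])
fence_b = mgr.submit([b])
mgr.gpu_completed(fence_a)
c = mgr.allocate(50)  # evicts a, since only a's fence has completed
print(mgr.resolve(a), mgr.resolve(b), mgr.resolve(c))
```

Because callers hold only handles, the data behind a handle can move between memories without the caller's reference changing, which is the property the remapping step of claim 43 relies on.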